Talk:Byzantine fault

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia

References to more work[edit]

This is a beginning, it needs references to more recent work (such as this osdi paper) and perhaps an explaination of the algorithms. It also needs some backlinks.

--Jsnow 22:09, 8 Sep 2004 (UTC)

Removed the External links part, as the only link in there was the same as the one in References. -- Olivier

Error in Paragraph Solutions[edit]

current "the Generals vote by treating each others' orders as votes." shouldnt it be that "the Generals vote by asking each of the direct subordinates or even lower ranks for comment where the amount of people asked depends on the 'velocity' of information exchange and processing."

16:18, 16 February 2006 (UTC) Jan Girke

Suggesting Merge[edit]

I'm suggesting that Byzantine failure and Byzantine fault tolerance be merged. Both pages primarily contain a description of the Byzantine Generals' Problem. -- ToastyKen 18:32, 16 October 2006 (UTC)[reply]

Agreed, I've merged them. There may need to be a little more cleanup. -- Bovineone 06:37, 15 December 2006 (UTC)[reply]

Origin section[edit]

The origin section currently contains two descriptions of the origin of the label "Byzantine". I've hacked the second of these out of the main article and placed it below. It seems the less-concise and weaker of the two, but others may disagree. Cheers, --Plumbago 09:20, 21 December 2006 (UTC)[reply]

The Byzantine Generals problem describes a group of generals, each commanding a division of the Byzantine army, encircling a city. These generals wish to formulate a plan for attacking the city. In its simplest form, the generals must only decide whether to attack or retreat. Some generals may prefer to attack, while others prefer to retreat. The important thing is that every general agrees on a common decision, for a halfhearted attack by a few generals would become a rout and be worse than a coordinated attack or a coordinated retreat.
The problem is complicated by the presence of traitorous generals who may not only cast a vote for a suboptimal strategy, they may do so selectively. For instance, if nine generals are voting, four of whom support attacking while four others are in favor of retreat, the ninth general may send a vote of retreat to a few generals, and a vote of attack to the rest. Those who received a retreat vote from the ninth general will retreat, while the rest will attack (which may not go well for the attackers).
  • Though on the other hand this version does contain more information that the other doesn't have. Mathmo Talk 09:05, 12 February 2007 (UTC)[reply]


Lamport explains the origin of the problem here and the origin of the name here. --TristramBrelstaff (talk) 16:12, 3 February 2008 (UTC)[reply]

It seems to me that the "Origin" section should be first, to give context to the problem. 207.171.191.60 (talk) 23:31, 14 June 2011 (UTC)[reply]

Motives for attacking[edit]

Such a nice little article! Can anyone list some of the mentioned motives for attacking P2P networks, so maybe it can be worked into the article?

Copyright concerns[edit]

This article was listed at the copyrights problem board on August 20th, here, as a potential infringement of this source. The editor who listed it there asserts that the "Third paragraph ("A Byzantine fault is ...") is a copy (with only slight rewording) of the last paragraph on the first page of Klempous, et al." The contributor who raised the question seems to be new, as this is his or her only contribution, and so he or she may not have understood the instructions on the page for blanking the section. I have followed through with that pending investigation. Unfortunately, I have no access to that material, but will seek out somebody who does to help clarify the matter. --Moonriddengirl (talk) 12:32, 7 September 2008 (UTC)[reply]

  • Resolved. It seems evident that Wikipedia cannot have infringed on that source with this text. That section of this article dates back to May of 2004, here. The identified source was first published in December of 2006. --Moonriddengirl (talk) 12:59, 7 September 2008 (UTC)[reply]

Examples of Byzantine Faults are odd[edit]

The paragraph starting "For example, if the output of one function is the input of another..." seems odd; a Byzantine fault can be any misbehavior at all of a node, in terms of when and if it transmits, and the contents and destination(s) of its transmission. Giving two examples, both of which are examples of local algorithms where a small error mushrooms into a large one, seems to suggest that this mushrooming is somehow typical or definitive of Byzantine faults. Which it isn't, is it? I am considering replacing this paragraph with a more straightforward statement that a Byzantine fault can be any misbehavior at all within the communication model, including faults intended to disrupt the system. If someone else would like to do that first, please feel free.  :) -- Orbst (talk) 22:21, 10 November 2008 (UTC)[reply]


Solutions Copied?[edit]

The entire text of the solutions seems to be verbatum the same as http://bft1.org/Links.php, which also links to the actual solutions (missing in the article). Could this be copying?

92.236.112.243 (talk) 12:48, 13 May 2009 (UTC)[reply]

I would suspect it copying from WP, actually. There is no robots.txt for that domain, and when I looked in the Internet Archive, that page does not show up, and indeed, nothing in that domain shows up. This suggests that it is less than 6-9 months old (the Google cache only dates back to 22 March), and I believe this page has been largely unchanged in that time span. --Gwern (contribs) 18:28 14 May 2009 (GMT)

n>3t?[edit]

"Many classic agreement problems, such as the Byzantine Generals' Problem, have no solution unless n > 3t, where n is the number of processes in the system"
I believe this is the case only for the oral messaging solution given in Lamport's paper - the signed solution can maintain tolerance for an arbitrary number of traitors.
--DarkCow (talk) 01:23, 19 March 2010 (UTC)[reply]

This discussion definitely needs work.

n>3t is also required for solutions in asynchronous system (a la Castro and Liskov's PBFT) (it would be good to track down a cite for the impossibility result). My recollection is that Lamport's signed protocol proceeds in synchronous rounds?

It may be easier to have this discussion make sense if it comes after some of the specific protocols are mentioned. Perhaps add a section at the end "Limits"? (and remove this discussion from the earlier section and perhaps shorten related discussions for some of the specific algorithms?)

Perhaps something like: "Many classic agreement problems, such as the Byzantine Generals' Problem, have no solution unless n > 3t, where n is the number of processes in the system. Lamport's first protocol above used unauthenticated messages [link -- cryptography or unauthenticated message?] and required n>3t, which is minimal [cite]. Lamport's second protocol above used cryptographically signed authenticated messages [link -- authenticated messages or cryptography?] and could tolerate any number of faulty generals, but it assumed a synchronous distributed system [link -- snchronous distributed system?] that bounds the time for correct nodes to exchange messages. Many recent protocols such as PBFT, QU, HQ, Zyzzyva, Aardvark, and UpRight are designed to operate in an asynchronous system [link -- asynchronous distributed system?]---ensuring correct operation without requiring timing assumptions but only guaranteeing progress when nodes can communicate in a timely fashion. Byzantine agreement in an asynchronous system requires n>3t, even with authenticated messages [cite].

--Md4567 (talk) 13:19, 19 March 2010 (UTC)[reply]

There are two errors in the article related to this. The FLP theorem says consensus can't be reached in any asynchronous setting if any processor fails. "Asynchronous" refers to either nodes or the network being "asynchronous". Node asynchrony means nodes are not synchronized by either logical sequence like Lamport timestamps or time (internal clock steps). Network asynchrony means the network has an unbounded propagation delay. 2nd error: Both classic papers with Lamport's algorithms in a synchronous setting show consensus can be reached with 2n+1, not 3n+1. Ywaz (talk) 12:36, 20 June 2019 (UTC)[reply]

Fiber optics cite?[edit]

The article states:

Byzantine refers to the Byzantine Generals' Problem, an agreement problem (first proposed by Marshall Pease, Leslie Lamport, and Robert Shostak in 1980[4])

Where '[4]' is "Julie K. Petersen (2003). CRC Press. ed. Fiber Optics Illustrated Dictionary. CRC Press. ISBN 9780849313493.". What does this have to do this Byzantine agreement? I suspect the article actually means to cite the Pease, Shostak and Lamport paper in JACM from 1980 ("Reaching agreement in the presence of faults"). Neilc (talk) 21:35, 8 November 2010 (UTC)[reply]

History[edit]

Lamport's description of the history of the use of the name "Byzantine" might be of some interest:
  http://research.microsoft.com/en-us/um/people/lamport/pubs/pubs.html#byz

Milkycow (talk) 18:01, 23 July 2011 (UTC)[reply]

Real-world application[edit]

Here's an article that indicates that the "Byzantine generals' algorithm" is used in all of the flight software of the triply-redundant computers running Linux used to control a wide variety of flight systems for SpaceX launch vehicles and spacecraft. ELC: SpaceX lessons learned, LWN.net, 6 Mar 2013. This is well beyond research; this is frequent applied technology use in the operation of space transportation systems. It is also used daily at SpaceX for a wide variety of ground tests of flight systems and flight prototype verification and validation testing. Cheers. N2e (talk) 20:17, 25 March 2013 (UTC)[reply]

Beginning of a major rewrite[edit]

Most of the existing text clearly was written by an editor or editors not well versed in the subject matter. Several problems previously have been identified by a number of people on this Talk page. As a recognized expert in the field, I've started a major rewrite. I'll do more when I have time. Some of the other language pages on this subject have some better content; which is odd given that all the cites are in English. This start of a major rewrite corrects the following problems:

  • Byzantine fault toleranceis not just a field of research; it has been put into practice for several decades now.
  • While the link to "fault-tolerant computer systems" is correct, it is slightly misleading because that page leans to fault tolerance of a single computer, whereas Byzantine fault tolerance is normally used in distributed systems
  • While many papers on Byzantine faults use the term arbitrary, this terminology is misleading When applied to Byzantine faults or failures. For example, a well-known Byzantine fault tolerance system was not able to tolerate an over-voltage power supply (no Byzantine fault tolerance mechanism could tolerate such a failure). Another class of faults for which Byzantine fault tolerance is ineffective is physical damage done to redundant components. If Byzantine fault tolerance can't tolerate many classes of failure, one cannot say Byzantine fault tolerance tolerates arbitrary failures. The arbitrary adjective might be correctly applied to the term errors.
  • The whole section "Failure modes" was deleted because it contained more errors than facts. Some examples:
    • Execution of algorithms or processes are not the only potential sources for Byzantine faults.
    • The misleading second paragraph has nothing to do with Byzantine failures.
    • All three sentences of the third paragraph had problems.
  • This deletion also moves the Origin section up, as suggested by a previous Talk contributor.
  • The reason given for the Byzantine Army being chosen as an example wasn't quite right.
  • The Byzantine Generals' attack scenario wasn't described well.
  • The "Early solutions" section should have mentioned the (c. 1980) pioneers: FTMP, MMFCS, and SIFT.
    • The first "Early solutions" bullet omits the crucial synchrony requirement
    • The second "Early solutions" bullet assumes sentient threats versus natural faults.
  • The opening assertion of the "Practical Byzantine fault tolerance" is wrong, as stated (again missed considering synchrony).
  • The implication that PBFT was the first practical Byzantine fault tolerance is wrong, without restricting what practical means (possibly restricting it to geographically-distributed asynchronous systems). Note that MMFCS (and probably FTMP and SIFT as well) were doing Byzantine fault tolerance with sub-millisecond added latency 20 years before PBFT. The standard ARINC 659 / SAFEbus® could do Byzantine fault tolerance with near microsecond added latency, nearly a decade before PBFT.

The Origin section is a merge and cleanup of a suggestion from the talk page and the previous text.

This page could use a figure like the one at de.wikipedia.org/wiki/Datei:ByzantineError.svg

The PBFT section seems to be over-weighted with respect the Byzantine fault tolerance subject are as a whole. Paxos is a more well known algorithm than most of those mentioned under the PBFT section. Paxos and other Byzantine fault tolerance algorithms, mechanisms, and implementations should be added to redress this imbalance.

A new section (e.g., the previously suggested "Limits" or "Tolerance Degree versus Resources") should be added that goes into detail on the required number of fault containment zones, rounds of exchange, and connectivity.

Should add a discussion on the fact that the terms synchronous and asynchronous have different definitions between what the majority of Byzantine papers use and what the rest of the real-time field uses.

Should include the '78 Davies and Wakerly paper.

The phrases interactive consistency and source congruency should probably redirect to Byzantine fault tolerance.

Path-logical (talk) 04:55, 5 March 2015 (UTC)[reply]

Bitcoin[edit]

I don't think bitcoin is a good example of Byzantine fault tolerance. I would suggest that it be removed. Yes, bitcoin solves certain simplified scenarios involving the BGP but there are too many baked in assumptions in the protocol for it to have expository value here. Pramod 16:07, 29 April 2015 (UTC) — Preceding unsigned comment added by Pramod.s (talkcontribs) And I strongly disagree. Most people associate the Byzantine problem with bitcoin. It'd be much better to state how they are related even if they need some assumptions, because that's what everybody wants to know nowadays. What's the use of an encyclopedia that does not have answers? — Preceding unsigned comment added by 112.161.5.152 (talk) 15:14, 23 September 2018 (UTC) The current statements on bitcoin do NOT give answers, it just hints towards a loose connection. Concretely: One formulation of BFT is with n generals, all of which have an input bit and want to agree on a common output bit. How is this solved in bitcoin? How/where do clients feed in their bit inputs? The bitcoin-BFT connection deserves a much better explanation, or complete removal. — Preceding unsigned comment added by 2001:620:20:222:FFFF:0:9260:4212 (talk) 14:53, 10 July 2019 (UTC)[reply]

distributed k-agreement[edit]

Are "distributed agreement" and "distributed k-agreement" (as mentioned on pages such as rendezvous hashing) the same as, or specific kinds of, Byzantine agreement?

If so, those phrases should redirect to this "Byzantine fault tolerance" article. If not, what article should those phrases should redirect to? --DavidCary (talk) 13:45, 20 July 2015 (UTC)[reply]

External links modified[edit]

Hello fellow Wikipedians,

I have just modified one external link on Byzantine fault tolerance. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{Sourcecheck}}).

This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 18 January 2022).

  • If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
  • If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers.—cyberbot IITalk to my owner:Online 22:53, 24 May 2016 (UTC)[reply]

Edit of fundamental hurdle of Byzantine fault tolerance. Am I correct? Please confirm or refute![edit]

Article stated: "Byzantine fault tolerance can be achieved if the loyal (non-faulty) generals have a unanimous agreement on their strategy."

My edit changes unanimous to majority, fundamentally changing the definition: "Byzantine fault tolerance can be achieved if the loyal (non-faulty) generals have a majority agreement on their strategy."

It seems to me that a majority is good enough (unanimous being the most complete form of majority but no necessary as a simple majority makes misdirection pointless). I post here in the hope that smarter people will check my instinct and affirm it is correct. I'm not a bitchain expert, and so excuse me if I am wrong!

Hard to understand[edit]

The opening sentence is byzantine. 5.34.26.87 (talk) 03:38, 28 December 2017 (UTC)[reply]

January 2018 tag - inadequate lead paragraph[edit]

I believe that the opening paragraph fails to follow the guidance given in the manual of style for lead paragraphs: https://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style/Lead_section#Opening_paragraph It doesn't concisely define, introduce, or provide context for the reader. Only a specialist could possibly understand it, but a specialist would have no need to read the article, therefore it is useless to everybody. I would try to fix it up myself, but I have absolutely no idea what Byzantine fault tolerance is, as the Wikipedia article fails to concisely explain it... Would appreciate if someone who knows about the topic could write a concise definition that provides appropriate context and introduction for the general reader /CrypticPeepShowReference 203.26.125.101 (talk) 04:39, 18 January 2018 (UTC)[reply]

You are right about the second paragraph of the introduction, which relies on the arcane term "node crash." However, the first paragraph made Byzantine fault tolerance quite clear to me, and I'm a layman who knows nothing about computer theory and never heard this term before opening its article. I'll relegate the second paragraph to the Background section and remove the tag. --Panem circenses (talk) 21:57, 9 February 2018 (UTC)[reply]

Introduction missing important nuance: Byzantine faults can be malicious and intelligent[edit]

The introduction explains that Byzantine nodes are "unreliable", but I've understood Byzantine faults to be worse than that: the nodes are malicious, can even have their own secret communication channels, and are actively trying to undermine the stability of the system. I see some of this later on in the article, but it seems to be lost in the introduction. — Preceding unsigned comment added by Campkeith (talkcontribs) 00:30, 22 February 2018 (UTC)[reply]

Cultural Appropriation[edit]

"... that "Albanian" might smack of what would now be called cultural appropriation ..."

Is that the correct use of the term "cultural appropriation" ? WP defines it as " ...the adoption of elements of one culture by members of another culture ...". That doesn't seem to fit here at all. No element of Albanian culture is being adopted and no Albanians are adopting anything. It doesn't seem to make sense.

In my opinion we see a totally needless possible misrepresentation of hypothetical Albanian generals by a person of another culture. I have no idea what the term for this might be or if there even is one. But it is most certainly not "cultural appropriation". Always bad to blindly insert buzzwords ... or maybe I just don't understand something. JB. --92.193.129.164 (talk) 11:40, 5 February 2019 (UTC)[reply]

WTF are you talking about?[edit]

"failures such as incorrect voltages can propagate through the encryption process."

What!? Software, mathematics, data, that's one domain (computing). Voltages are another (electronics). A computer doesn't even have to be electrical. But how can an "incorrect voltage" affect software? In reality that might be expressed as a bit being set wrong but from then on that bit becomes part of the maths, data, computing. Voltages don't propagate through software, capacitance isn't measured in kilobytes, and you can't put machine code through a Pi filter.

It's nonsense. It's not even an argument. Why does this article contain utter nonsense? Who's writing it and who's responsible for stopping them? How much else of this article is utter rubbish? Did 6 people who knew what they were talking about try to come together on an article, but one of them was a traitor, and also his computer was plumbed into a washing machine?

84.65.94.92 (talk) 07:38, 16 January 2022 (UTC)[reply]