A (Availability): The failure of a few nodes must not adversely affect the operation of the remaining healthy nodes.
P (Partition Tolerance): The system must keep operating normally even when some messages are lost.
ex) Example 2
Consistency: All nodes must see the same data at the same time.
Availability: Every node must always be able to read and write.
Partition Tolerance: The system must keep working well even across physical network partitions.
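Hmm.... there was a lot here I couldn't really follow.. ㅠ.ㅠ..
So I decided to translate the English and work it out for myself, and pulled the excerpts below from two English-language sites.
It turned out to be a database that has recently come to stand shoulder to shoulder with RDBMS!! It holds data, but by what criterion? What kind of data was it built to handle?! As shown below, it is grounded in the CAP theorem. RDBMS is too, of course! An RDBMS is said to focus on CA.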
Consistency(all nodes see the same data at the same time)
=> ACE-T's reading: All nodes must see the same data at the same time!!
Availability(a guarantee that every request receives a response about whether it was successful or failed)
=> ACE-T's reading: A guarantee!! Every request receives a response saying whether it succeeded or failed!
Partition tolerance(the system continues to operate despite arbitrary message loss or failure of part of the system)
=> ACE-T's reading: The system must keep operating, even if arbitrary messages are lost or part of the system fails!!
The three requirements are: Consistency, Availability and Partition Tolerance, giving Brewer's Theorem its other name - CAP.
To give these some real-world meaning let's use a simple example: you want to buy a copy of Tolstoy's War & Peace to read on a particularly long vacation you're starting tomorrow. Your favourite web bookstore has one copy left in stock. You do your search, check that it can be delivered before you leave and add it to your basket. You remember that you need a few other things so you browse the site for a bit (have you ever bought just one thing online? Gotta maximise the parcel dollar). While you're reading the customer reviews of a suntan lotion product, someone, somewhere else in the country, arrives at the site, adds a copy to their basket and goes right to the checkout process (they need an urgent fix for a wobbly table with one leg much shorter than the others).
Consistency
A service that is consistent operates fully or not at all. Gilbert and Lynch use the word "atomic" instead of consistent in their proof, which makes more sense technically because, strictly speaking, consistent is the C in ACID as applied to the ideal properties of database transactions and means that data will never be persisted that breaks certain pre-set constraints. But if you consider it a preset constraint of distributed systems that multiple values for the same piece of data are not allowed then I think the leak in the abstraction is plugged (plus, if Brewer had used the word atomic, it would be called the AAP theorem and we'd all be in hospital every time we tried to pronounce it).
In the book buying example you can add the book to your basket, or fail. Purchase it, or not. You can't half-add or half-purchase a book. There's one copy in stock and only one person will get it the next day. If both customers can continue through the order process to the end (i.e. make payment) the lack of consistency between what's in stock and what's in the system will cause an issue. Maybe not a huge issue in this case - someone's either going to be bored on vacation or spilling soup - but scale this up to thousands of inconsistencies and give them a monetary value (e.g. trades on a financial exchange where there's an inconsistency between what you think you've bought or sold and what the exchange record states) and it's a huge issue.
We might solve consistency by utilising a database. At the correct moment in the book order process the number of War and Peace books-in-stock is decremented by one. When the other customer reaches this point, the cupboard is bare and the order process will alert them to this without continuing to payment. The first operates fully, the second not at all.
Databases are great at this because they focus on ACID properties and give us Consistency by also giving us Isolation, so that when Customer One is reducing books-in-stock by one, and simultaneously increasing books-in-basket by one, any intermediate states are isolated from Customer Two, who has to wait a few milliseconds while the data store is made consistent.
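To make the bookstore example concrete, here is a minimal sketch of the "decrement books-in-stock" step, assuming a single SQLite database. The table name stock, the column copies, and the helper try_reserve are all hypothetical names made up for illustration; the quoted article does not specify any schema or code. The point is only that the check and the decrement happen in one atomic statement, so the first customer "operates fully" and the second "not at all".

```python
import sqlite3


def try_reserve(conn, title):
    """Take one copy from stock atomically; return True if a copy was reserved."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            # The stock check and the decrement are a single statement,
            # so no other customer can see an intermediate state.
            "UPDATE stock SET copies = copies - 1 "
            "WHERE title = ? AND copies > 0",
            (title,),
        )
        return cur.rowcount == 1


# Hypothetical schema: one row per title with the number of copies left.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock (title TEXT PRIMARY KEY, copies INTEGER NOT NULL)"
)
conn.execute("INSERT INTO stock VALUES (?, ?)", ("War and Peace", 1))
conn.commit()

customer_one = try_reserve(conn, "War and Peace")  # gets the last copy
customer_two = try_reserve(conn, "War and Peace")  # the cupboard is bare
remaining = conn.execute(
    "SELECT copies FROM stock WHERE title = ?", ("War and Peace",)
).fetchone()[0]

print(customer_one, customer_two, remaining)
```

With one copy in stock, only the first call succeeds and the stock never goes below zero. This only covers the single-database case; it says nothing yet about what happens once the data is spread over several nodes.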
Availability
Availability means just that - the service is available (to operate fully or not as above). When you buy the book you want to get a response, not some browser message about the web site being uncommunicative. Gilbert & Lynch in their proof of CAP Theorem make the good point that availability most often deserts you when you need it most - sites tend to go down at busy periods precisely because they are busy. A service that's available but not being accessed is of no benefit to anyone.
Hmm... reading it roughly, it seems to say that under the CAP theorem a service may get so busy that it can't be accessed or even goes down, and yet the service still has to be usable! In other words, there has to be a response (the "not ... uncommunicative" part).
Partition Tolerance
If your application and database runs on one box then (ignoring scale issues and assuming all your code is perfect) your server acts as a kind of atomic processor in that it either works or doesn't (i.e. if it has crashed it's not available, but it won't cause data inconsistency either).
Once you start to spread data and logic around different nodes then there's a risk of partitions forming. A partition happens when, say, a network cable gets chopped, and Node A can no longer communicate with Node B. With the kind of distribution capabilities the web provides, temporary partitions are a relatively common occurrence and, as I said earlier, they're also not that rare inside global corporations with multiple data centres.
Gilbert & Lynch defined partition tolerance as:
No set of failures less than total network failure is allowed to cause the system to respond incorrectly
and noted Brewer's comment that a one-node partition is equivalent to a server crash, because if nothing can connect to it, it may as well not be there.
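The trade-off that a partition forces can be sketched with a toy model. Everything here is an assumption made up for illustration (the Node and Cluster classes, the "CP"/"AP" mode names, the partitioned flag); it is not how any real database is implemented. Two replicas normally copy every write to each other. When the link between them is cut, a "CP"-style cluster refuses the write to stay consistent, while an "AP"-style cluster accepts it locally and lets the replicas diverge.

```python
class Node:
    """One replica holding its own copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class Cluster:
    """Two replicas joined by a link that can be cut (toy model)."""

    def __init__(self, mode):
        self.mode = mode  # "CP" or "AP" (hypothetical labels)
        self.a = Node("A")
        self.b = Node("B")
        self.partitioned = False

    def write(self, node, key, value):
        """Write via `node`; return True if the write was accepted."""
        peer = self.b if node is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Cannot reach the peer: refuse, so replicas stay identical.
                return False
            # AP: accept locally, so the replicas may now disagree.
            node.data[key] = value
            return True
        # Healthy network: apply the write on both replicas.
        node.data[key] = value
        peer.data[key] = value
        return True


cp = Cluster("CP")
cp.write(cp.a, "copies", 1)
cp.partitioned = True  # the cable between A and B gets chopped
cp_accepted = cp.write(cp.a, "copies", 0)
cp_consistent = cp.a.data == cp.b.data

ap = Cluster("AP")
ap.write(ap.a, "copies", 1)
ap.partitioned = True
ap_accepted = ap.write(ap.a, "copies", 0)
ap_consistent = ap.a.data == ap.b.data

print(cp_accepted, cp_consistent)
print(ap_accepted, ap_consistent)
```

In this sketch the CP cluster rejects the write during the partition but both replicas still agree, while the AP cluster accepts the write and node A and node B end up with different values for the same key.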
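The site referenced above is the blog of a developer named nahurst, who made a diagram that presents the CAP theorem for NoSQL visually.
He seems to have put it together from his own study, so I'm not 100% sure about it, but it is definitely a site worth referring to.
He asks people to let him know if anything looks off, so he can fix it.. lol
In other words, this is not official material on the theorem. He may just be a well-known programmer, one of us...
I don't really know who he is ^0^;; The diagram is below.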