[NoSQL] CAP Theorem

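At first.. I thought NoSQL was a scripting language used for processing data..

Then, as I studied, I found the CAP theorem mentioned as prerequisite knowledge for NoSQL.

So I looked it up... and CAP is explained in so many different ways... @.@....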

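ex) Example 1

ex) Example 2

Hmm.... there was a lot I didn't really understand.. T.T..

So I decided to translate the English myself to understand it properly, and excerpted from the two English sites below.
It turns out NoSQL is a kind of database that has recently come to stand shoulder to shoulder with RDBMS!! So by what criteria, and for what kind of data, did it come about?! As shown below, it is grounded in the CAP theorem. Of course, the same goes for RDBMS! An RDBMS is said to focus on CA.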

Source: http://en.wikipedia.org/wiki/CAP_theorem

What is CAP?

Consistency (all nodes see the same data at the same time)
=> ACE-T's translation: all nodes must see the same data at the same time!!
Availability (a guarantee that every request receives a response about whether it was successful or failed)
=> ACE-T's translation: a guarantee!! That every request receives a response saying whether it succeeded or failed!
Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system)
=> ACE-T's translation: the system must keep operating, even if arbitrary messages are lost or part of the system fails!!

http://www.julianbrowne.com/article/viewer/brewers-cap-theorem

The three requirements are: Consistency, Availability and Partition Tolerance, giving Brewer's Theorem its other name - CAP.

To give these some real-world meaning let's use a simple example: you want to buy a copy of Tolstoy's War & Peace to read on a particularly long vacation you're starting tomorrow. Your favourite web bookstore has one copy left in stock. You do your search, check that it can be delivered before you leave and add it to your basket. You remember that you need a few other things so you browse the site for a bit (have you ever bought just one thing online? Gotta maximise the parcel dollar). While you're reading the customer reviews of a suntan lotion product, someone, somewhere else in the country, arrives at the site, adds a copy to their basket and goes right to the checkout process (they need an urgent fix for a wobbly table with one leg much shorter than the others).

Consistency

A service that is consistent operates fully or not at all. Gilbert and Lynch use the word "atomic" instead of consistent in their proof, which makes more sense technically because, strictly speaking, consistent is the C in ACID as applied to the ideal properties of database transactions and means that data will never be persisted that breaks certain pre-set constraints. But if you consider it a preset constraint of distributed systems that multiple values for the same piece of data are not allowed then I think the leak in the abstraction is plugged (plus, if Brewer had used the word atomic, it would be called the AAP theorem and we'd all be in hospital every time we tried to pronounce it).

In the book buying example you can add the book to your basket, or fail. Purchase it, or not. You can't half-add or half-purchase a book. There's one copy in stock and only one person will get it the next day. If both customers can continue through the order process to the end (i.e. make payment) the lack of consistency between what's in stock and what's in the system will cause an issue. Maybe not a huge issue in this case - someone's either going to be bored on vacation or spilling soup - but scale this up to thousands of inconsistencies and give them a monetary value (e.g. trades on a financial exchange where there's an inconsistency between what you think you've bought or sold and what the exchange record states) and it's a huge issue.

We might solve consistency by utilising a database. At the correct moment in the book order process the number of War and Peace books-in-stock is decremented by one. When the other customer reaches this point, the cupboard is bare and the order process will alert them to this without continuing to payment. The first operates fully, the second not at all.
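
The decrement step described above can be sketched with Python's built-in sqlite3 module. This is only an illustration, not the bookstore's actual design: the `stock` table, its `copies` column, and the `reserve_copy` helper are all names assumed for this example. The stock is only decremented when a copy is still available, so the second customer is refused.

```python
import sqlite3

def reserve_copy(conn, title):
    """Take one copy out of stock; return True only if a copy was available."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE stock SET copies = copies - 1 WHERE title = ? AND copies > 0",
            (title,),
        )
    # rowcount is 1 if the row was updated, 0 if the cupboard was already bare
    return cur.rowcount == 1

# Hypothetical in-memory store holding the single remaining copy
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (title TEXT PRIMARY KEY, copies INTEGER NOT NULL)")
conn.execute("INSERT INTO stock VALUES (?, ?)", ("War and Peace", 1))
conn.commit()

first = reserve_copy(conn, "War and Peace")   # Customer One
second = reserve_copy(conn, "War and Peace")  # Customer Two
print(first, second)  # prints: True False
```

Putting the `copies > 0` check inside the UPDATE itself means the check and the decrement happen as one statement, which is what lets the first order operate fully and the second not at all.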

Databases are great at this because they focus on ACID properties and give us Consistency by also giving us Isolation, so that when Customer One is reducing books-in-stock by one, and simultaneously increasing books-in-basket by one, any intermediate states are isolated from Customer Two, who has to wait a few milliseconds while the data store is made consistent.
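
As a rough sketch of the two-step change described above (books-in-stock down by one, books-in-basket up by one), the following wraps both steps in a single sqlite3 transaction. The table names, the `move_to_basket` helper, and the `fail_midway` flag are assumptions made up for this example. It only shows that a half-finished change is rolled back and never persisted; it does not demonstrate isolation between concurrent connections.

```python
import sqlite3

def setup():
    """Create a hypothetical in-memory store with one copy in stock."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stock (title TEXT PRIMARY KEY, copies INTEGER NOT NULL)")
    conn.execute("CREATE TABLE basket (customer TEXT, title TEXT)")
    conn.execute("INSERT INTO stock VALUES (?, ?)", ("War and Peace", 1))
    conn.commit()
    return conn

def move_to_basket(conn, customer, title, fail_midway=False):
    """Decrement stock and add to the basket as one unit of work."""
    try:
        with conn:  # both statements commit together or not at all
            conn.execute("UPDATE stock SET copies = copies - 1 WHERE title = ?", (title,))
            if fail_midway:
                # Simulated crash between the two steps
                raise RuntimeError("simulated failure between the two steps")
            conn.execute("INSERT INTO basket VALUES (?, ?)", (customer, title))
        return True
    except RuntimeError:
        return False

def snapshot(conn):
    """Return (copies in stock, rows in basket)."""
    copies = conn.execute("SELECT copies FROM stock").fetchone()[0]
    in_basket = conn.execute("SELECT COUNT(*) FROM basket").fetchone()[0]
    return copies, in_basket

conn = setup()
failed = move_to_basket(conn, "customer-one", "War and Peace", fail_midway=True)
after_failure = snapshot(conn)   # the decrement was rolled back
succeeded = move_to_basket(conn, "customer-one", "War and Peace")
after_success = snapshot(conn)   # both changes were committed together
print(failed, after_failure, succeeded, after_success)
```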

Availability

Availability means just that - the service is available (to operate fully or not as above). When you buy the book you want to get a response, not some browser message about the web site being uncommunicative. Gilbert & Lynch in their proof of CAP Theorem make the good point that availability most often deserts you when you need it most - sites tend to go down at busy periods precisely because they are busy. A service that's available but not being accessed is of no benefit to anyone.

Hmm... reading it roughly.. it seems to say that a service can become inaccessible or go down because it is busy, but the service still has to be usable! In other words, there has to be a response. not ~uncommunicative
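
To make "every request receives a response about whether it was successful or failed" a little more concrete, here is a minimal sketch. The `handle_request` wrapper and the two fake backends are hypothetical names invented for this example; a real service would also need timeouts, retries, and redundancy to stay available under load.

```python
def handle_request(backend, request):
    """Always return a response, whether the backend call succeeded or failed."""
    try:
        return {"ok": True, "result": backend(request)}
    except Exception as exc:
        # The caller still gets an answer instead of an uncommunicative site
        return {"ok": False, "error": str(exc)}

def healthy_backend(request):
    """Stand-in for a backend that is keeping up with traffic."""
    return f"ordered {request}"

def overloaded_backend(request):
    """Stand-in for a backend that is too busy to serve the request."""
    raise TimeoutError("backend too busy")

good = handle_request(healthy_backend, "War and Peace")
bad = handle_request(overloaded_backend, "War and Peace")
print(good)
print(bad)
```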

Partition Tolerance

If your application and database runs on one box then (ignoring scale issues and assuming all your code is perfect) your server acts as a kind of atomic processor in that it either works or doesn't (i.e. if it has crashed it's not available, but it won't cause data inconsistency either).

Once you start to spread data and logic around different nodes then there's a risk of partitions forming. A partition happens when, say, a network cable gets chopped, and Node A can no longer communicate with Node B. With the kind of distribution capabilities the web provides, temporary partitions are a relatively common occurrence and, as I said earlier, they're also not that rare inside global corporations with multiple data centres.

Gilbert & Lynch defined partition tolerance as:

No set of failures less than total network failure is allowed to cause the system to respond incorrectly

and noted Brewer's comment that a one-node partition is equivalent to a server crash, because if nothing can connect to it, it may as well not be there.
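
The trade-off that a partition forces can be sketched with a toy two-node model. Everything here (the `Replica` and `Cluster` classes, the "CP" and "AP" mode labels, the `buy` method) is an assumption made up for illustration, not taken from either article. While Node A and Node B cannot talk, a "CP" cluster refuses the sale to stay consistent, and an "AP" cluster keeps answering and ends up selling the single copy twice.

```python
class Replica:
    """One node holding its own copy of the stock count."""
    def __init__(self, name):
        self.name = name
        self.stock = 1  # one copy of the book left

class Cluster:
    """Two replicas that normally synchronise after every sale."""
    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP" (labels assumed for this sketch)
        self.a = Replica("A")
        self.b = Replica("B")
        self.partitioned = False  # True when A and B cannot communicate

    def buy(self, replica):
        if self.partitioned and self.mode == "CP":
            # Give up availability rather than risk inconsistent data
            return "rejected: cannot reach the other node"
        if replica.stock == 0:
            return "rejected: out of stock"
        replica.stock -= 1
        if not self.partitioned:
            # Healthy network: copy the new count to the other node
            other = self.b if replica is self.a else self.a
            other.stock = replica.stock
        return "sold"

cp = Cluster("CP")
cp.partitioned = True
cp_result = cp.buy(cp.a)

ap = Cluster("AP")
ap.partitioned = True
ap_results = [ap.buy(ap.a), ap.buy(ap.b)]

print(cp_result, (cp.a.stock, cp.b.stock))
print(ap_results, (ap.a.stock, ap.b.stock))
```

In the "AP" run both customers are told "sold" even though only one copy existed, which is the same inconsistency as in the book-buying example.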

http://blog.nahurst.com/visual-guide-to-nosql-systems

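The site above is the blog of a developer called nahurst, who made a picture that visualizes the CAP theorem for NoSQL systems.
It looks like he made it from his own study, so I can't be 100% sure of it, but it is definitely a site worth consulting.
He asks to be told if anything looks wrong.. so he can fix it.. lol
In other words, it is not official theory. He may just be a well-known programmer like the rest of us...
I don't really know who he is ^0^;; The picture is below.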

Sites really worth consulting~!!!

http://www.julianbrowne.com/article/viewer/brewers-cap-theorem

http://blog.nahurst.com/visual-guide-to-nosql-systems

http://nosql-database.org/

http://blog.outsider.ne.kr/519 / http://blog.outsider.ne.kr/520

http://www.vineetgupta.com/2010/01/nosql-databases-part-1-landscape/

http://www.pearltrees.com/#/N-s=1_5827086&N-f=1_5827086&N-fa=5793524&N-u=1_752336&N-p=54609535

These are the sites I consulted while preparing an in-house skills-development presentation.

License: Attribution, NonCommercial, NoDerivatives

Developer 태하팍
