Elliot's Super Computing Duper Programmer: 2018

Monday, August 13, 2018

UUID 64

UUID 64Bit

UUID generation is very important, because the result must be a truly unique identifier. Auto increment in an RDBMS is often used to generate unique IDs, but it slows down as the number of rows grows, slows down further under heavy access, and becomes a single point of failure.
When the RDBMS is sharded across multiple machines instead of running on a single one, unique IDs must be generated differently to avoid collisions. People want to use 128-bit UUIDs for this, but 128 bits is a heavy cost: most RDBMSs support 64 bits as a big int, and anything larger can cause performance issues.
If performance is not a major concern, 128-bit UUIDs are recommended. After all, Cassandra uses uuid and timeuuid for ID generation by default, and its performance remains great.
Otherwise, a 64-bit version can be used instead. As with timeuuid, the timestamp goes in the top bits so that later IDs sort after earlier ones.
As long as each ID generator (host) has a distinct host ID, collisions are avoided, and each host can generate up to 64K (2^16 = 65,536) unique IDs per second. This is sufficient for most use cases.
The 64-bit UUID was implemented in Go and is used successfully in production. The key is combining Timestamp (32 bits) | HostID (16 bits) | Atomic Sequence Number (16 bits) to generate unique IDs while avoiding collisions.

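UUID generation is very important,
because the ID must literally be a unique identifier.

Auto increment in an RDBMS is commonly used to generate unique IDs, but it gets slower as the number of rows grows, slower as access increases, and becomes a single point of failure.

If the RDBMS is not a single machine but is sharded across multiple shards, unique IDs have to be generated differently, because ID collisions must be avoided.

That is why people want to use UUIDs, but at 128 bits they are a burden.

Most RDBMSs support 64 bits as a big int, and anything larger can cause performance problems. If you are not particularly sensitive to performance, I recommend simply using 128-bit UUIDs. After all, doesn't Cassandra use uuid and timeuuid for ID generation by default? Its performance is still good.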

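So I built a 64-bit version and use that.
As with timeuuid, the timestamp sits in the upper bits, so IDs created later sort after earlier ones.
As long as the ID generators (hosts) are distinct, collisions are avoided, and each host can generate 64K unique IDs per second.
That is enough for practical use.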

The following is the Go implementation of the 64-bit UUID version.
It is working well in production.

The key is to combine Timestamp (32 bits) | HostID (16 bits) | Atomic Sequence Number (16 bits) to generate unique IDs while avoiding collisions.

---start of the code---
---end of the code---
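
The Timestamp | HostID | Sequence layout described in this post can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not the production code the post refers to: the names `Generator64`, `NewGenerator64`, `compose`, and `Decode` are hypothetical, the timestamp is assumed to be Unix seconds, and each host is assumed to receive a distinct 16-bit ID by some external means.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// Generator64 (hypothetical name) issues 64-bit IDs laid out as
// Timestamp (32 bits) | HostID (16 bits) | Sequence (16 bits).
type Generator64 struct {
	hostID uint16
	seq    uint32 // bumped atomically; only the low 16 bits go into the ID
}

// NewGenerator64 assumes the caller gives every host a distinct hostID.
func NewGenerator64(hostID uint16) *Generator64 {
	return &Generator64{hostID: hostID}
}

// compose packs the three fields, with the timestamp in the top bits.
func compose(sec uint32, hostID, seq uint16) uint64 {
	return uint64(sec)<<32 | uint64(hostID)<<16 | uint64(seq)
}

// Decode splits an ID back into its three fields.
func Decode(id uint64) (sec uint32, hostID, seq uint16) {
	return uint32(id >> 32), uint16(id >> 16), uint16(id)
}

// Next returns a new ID. It is safe for concurrent use because the
// sequence number is advanced with an atomic add.
func (g *Generator64) Next() uint64 {
	sec := uint32(time.Now().Unix()) // assumption: Unix seconds, truncated to 32 bits
	seq := uint16(atomic.AddUint32(&g.seq, 1))
	return compose(sec, g.hostID, seq)
}

func main() {
	g := NewGenerator64(7) // 7 is an arbitrary example host ID
	id := g.Next()
	sec, host, seq := Decode(id)
	fmt.Printf("id=%d sec=%d host=%d seq=%d\n", id, sec, host, seq)
}
```

In this sketch, uniqueness holds only while a host issues at most 65,536 IDs within one second; beyond that the 16-bit sequence wraps and IDs can repeat. IDs from a later second always compare greater than IDs from an earlier one when treated as unsigned integers, but the 32-bit seconds field wraps in 2106, and a signed 64-bit column would see negative values once the top bit is set in 2038.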

Sunday, August 5, 2018

Strength In Erlang

On projects whose servers are not written in Erlang, the hardest part is that opening the service feels risky unless every API has passed unit tests and the whole service has passed integration and stress tests.
Erlang projects need those tests before release as well.
The big advantage is that when a problem appears after launch, the cause can be found right away and the fix deployed without interrupting the service. This is especially valuable for socket-based services that must maintain session context.
So Erlang is hard to pass up even though its computation performance is lower than that of a native C++ server. It is not drastically slower either: for I/O-bound work the difference is small, and thanks to its excellent multi-core support Erlang can outperform servers that lack it.
There is a paper comparing Go, Erlang, and Akka that is well worth reading.
Recommendations:
Use Erlang for socket-based services that maintain session context.
For HTTP REST services that do not, Erlang is not essential.
Despite these advantages, Erlang has not become hugely popular, probably because of its difficult syntax and small community. Elixir has started to cover those weaknesses, but not sufficiently, and it feels awkward to existing Erlang developers. Developers with Ruby experience, or those hesitant to start with Erlang, can use Elixir.

Coverage Tests and Workload Tests Before Release

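When I come across a real-world project whose server was not built with Erlang,
the most difficult point is this:
"I am afraid to open the service unless it has passed unit tests for every API, integration tests for the whole service, and stress tests."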

That does not mean unit tests, integration tests, and stress tests are skipped when Erlang is used.

Hot Code Reloading

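However, when a problem occurs after launch, being able to identify the cause immediately and update without stopping the service is a huge advantage.
That advantage shines especially in socket-based services that must maintain session context.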

Performance

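So, even with lower computation performance than a C++ server (which runs as a native binary), it is hard not to choose Erlang.
That said, its performance is not dramatically worse. When most of the work is I/O bound there is little difference, and because Erlang's multi-core support is excellent, it actually performs much better than servers that do not properly support multiple cores.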

There is a paper comparing Go, Erlang, and Akka. I strongly recommend reading it.

http://www.dcs.gla.ac.uk/~trinder/papers/sac-18.pdf

Recommendation

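* Implement socket-based services that must maintain session context in Erlang.

* For HTTP REST services that do not, it probably does not matter if Erlang is not used.

* Despite these advantages, the reason Erlang has not become explosively popular is probably its difficult syntax and small community. Elixir has started to cover those weaknesses, but not sufficiently, and it is actually less comfortable for existing Erlang developers. Developers with Ruby experience, or those who are nervous about approaching Erlang for the first time, could implement in Elixir instead.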

Friday, August 3, 2018

Lead Programmer, Chief Architect, Chief Technology Officer

It has been a while since I posted. I have worked on several projects in the meantime. The development succeeded, but results in the market were limited.
Looking back at my career as a programmer, I have completed a great many projects and been through a lot. I rarely make mistakes now, and I can make choices that fit the time available.
I used to admire the Chief Architect role. Now I think I understand what it entails, because it describes what I do these days.
It has been 33 years since I started coding and 25 years since I started earning a living from software development. Once a Lead Programmer has built up experience, can evaluate good and bad practices, has insight into program structure, and can see what the good choice is rather than just working hard, they are close to being an Architect.
The difference from a CTO is that the Architect is more hands-on. The CTO role is harder, because a CTO must get technical decisions reflected in business decisions. Political standing in the company and credit for technical decisions are both essential.
I'm still learning and gaining new perspectives. Open source creativity impresses me. I want to share good practices so developers create quality software without struggle.
I aim to explain good practices simply, give back inspiration received, and attract more talented developers to our company and team. With that spirit, I'm starting to blog again.

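It has been a while since my last post.

I have worked on several projects in the meantime.

I consider the development a success, but the results in the market were unimpressive.

When I counted the projects I have completed over my years as a programmer, there were a great many.

I have been through a lot.

I rarely make mistakes now.

I also know how to make choices that fit the time available.

There was a time when the Chief Architect just looked cool to me. I vaguely thought I would like to try such a stylish role.

Now I think I know what that role is.

It seems to be what I do these days.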

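It has been 33 years since I started coding, and 25 years since I started making money from software development.

When a Lead Programmer has built up experience, can evaluate good and bad practices, has developed the ability to see into the structure of a program, and, having had that awakening, can see what the good choice is rather than simply working hard, they have come close to being an Architect.

I think the difference from a Chief Technology Officer is that the Architect is closer to hands-on work.

In fact, the CTO role is harder, because I think a CTO is someone who can get technical decisions reflected in the company's business decisions. Political standing within the company and credit for technical decisions are essential.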

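I am still learning and gaining new insights.
I am impressed by the creativity of open source.

I want to introduce and share good practices.

I want developers to build high-quality software without suffering.

I want to share good practices in plain language and give back the inspiration I have received.

I hope more good developers gather in our company and our team.

With that in mind, I am going to start blogging again.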