Created by Mali Akmanalp / @makmanalp
Mali Akmanalp
Platform / Data Engineer
@Custommade
Simple, right?
Let's say you have a database of silly cats ...
cat1, cat2, cat3 ...
CREATE TABLE kittens (
`id` BIGINT AUTO_INCREMENT,
PRIMARY KEY(`id`)
);
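A minimal sketch of what the auto-increment column buys you, using Python's built-in sqlite3 as a stand-in for the single database server. The slide's table looks like MySQL, so the SQLite spelling of auto-increment and the `name` column are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the one database server (assumption:
# the slide's DDL is MySQL; SQLite spells auto-increment differently).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE kittens (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
)

def add_kitten(name):
    """Insert one row and return the ID the database assigned to it."""
    cur = conn.execute("INSERT INTO kittens (name) VALUES (?)", (name,))
    conn.commit()
    return cur.lastrowid

# `name` is a hypothetical column, added only so there is something to insert.
ids = [add_kitten(n) for n in ("cat1", "cat2", "cat3")]
print(ids)  # [1, 2, 3]
```

One server hands out every ID, which is exactly why it is simple and exactly why it stops being simple with more than one server.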
But when you start getting more data ...
But now all your nodes have to talk to (and maybe wait on) each other!
Still goes over the net
More infrastructure
SPOF (kinda)
>>> import uuid
>>> uuid.uuid4()
UUID('74691173-e69d-440d-8172-dd63c97d1e87')
standard
great language / db support
but ...
236abc75-f7e5-11e2-bc8a-b88d1204f9a2
Number of 100-nanosecond intervals since the adoption of the Gregorian calendar in the West.
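236abc75-f7e5-11e2-bc8a-b88d1204f9a2
Timestamp pieces by significance: 236abc75 = 3 (low bits), f7e5 = 2 (middle bits), 11e2 = 1 (version + high bits)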
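A short sketch of pulling that timestamp back out of the example UUID with the standard `uuid` module. The helper name `uuid1_datetime` is made up for this example. The epoch is 15 October 1582, the date the Gregorian calendar was adopted.

```python
import uuid
from datetime import datetime, timedelta

# Zero point of version 1 UUID timestamps: adoption of the Gregorian calendar.
GREGORIAN_EPOCH = datetime(1582, 10, 15)

def uuid1_datetime(u):
    """Return the UTC datetime embedded in a version 1 UUID (hypothetical helper)."""
    if u.version != 1:
        raise ValueError("only version 1 UUIDs carry a timestamp")
    # u.time is the 60-bit count of 100-nanosecond intervals since the epoch.
    return GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)

u = uuid.UUID("236abc75-f7e5-11e2-bc8a-b88d1204f9a2")

# The timestamp is stored low bits first, so the string does not sort by time.
high = u.time_hi_version & 0x0FFF  # strip the 4 version bits
rebuilt = (high << 48) | (u.time_mid << 32) | u.time_low

print(rebuilt == u.time)
print(uuid1_datetime(u))
```

Because the low bits come first in the text form, two UUIDs generated a second apart can land far apart in an index.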
10765432100123456789
Timestamp + Machine ID + Sequence Number
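A rough sketch of how the three fields can be packed into one 64-bit integer. The field widths (41 + 10 + 12 bits) are an assumption based on Twitter's Snowflake layout and do not come from this slide. The names `make_id` and `split_id` are invented for the example.

```python
# Assumed field widths (Snowflake-style layout): 41 + 10 + 12 bits.
TIMESTAMP_BITS, MACHINE_BITS, SEQUENCE_BITS = 41, 10, 12

def make_id(timestamp_ms, machine_id, sequence):
    """Pack the fields into one integer, timestamp in the high bits."""
    for value, bits in ((timestamp_ms, TIMESTAMP_BITS),
                        (machine_id, MACHINE_BITS),
                        (sequence, SEQUENCE_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError("field does not fit in %d bits" % bits)
    return ((timestamp_ms << (MACHINE_BITS + SEQUENCE_BITS))
            | (machine_id << SEQUENCE_BITS)
            | sequence)

def split_id(value):
    """Undo make_id: return (timestamp_ms, machine_id, sequence)."""
    sequence = value & ((1 << SEQUENCE_BITS) - 1)
    machine_id = (value >> SEQUENCE_BITS) & ((1 << MACHINE_BITS) - 1)
    timestamp_ms = value >> (MACHINE_BITS + SEQUENCE_BITS)
    return timestamp_ms, machine_id, sequence

# Hypothetical values: millisecond 1000, machine 5, 7th ID in that millisecond.
example = make_id(1000, 5, 7)
print(example, split_id(example))
```

Since the timestamp sits in the high bits, IDs sort roughly by creation time. The machine ID and sequence number are what need coordinating, hence the extra infrastructure.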
more infrastructure ...
I'm pretty sure our ops guy Wes time travels to handle what's already on his plate.
Simplify, simplify, simplify your ID generation scheme. -- Thoreau
10765432100123456789
Timestamp (41b) + Random Number (23b)
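A minimal sketch of the idea, not the library's actual source: shift a millisecond timestamp left by 23 bits and fill the low bits with randomness. The function name `make_flake` and the `EPOCH` value are assumptions. The epoch was picked so that the sample ID shown in this deck decodes to its displayed timestamp.

```python
import random
import time

TIMESTAMP_BITS = 41
RANDOM_BITS = 23
# Assumed custom epoch in Unix seconds (inferred from this deck's sample ID).
EPOCH = 946702800

def make_flake(now=None, rand=None):
    """Build a 64-bit ID: 41-bit millisecond timestamp, then 23 random bits."""
    now = time.time() if now is None else now
    millis = int(round((now - EPOCH) * 1000))
    if not 0 <= millis < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in 41 bits")
    if rand is None:
        # No machine ID and no sequence counter: just ask the OS for randomness.
        rand = random.SystemRandom().getrandbits(RANDOM_BITS)
    return (millis << RANDOM_BITS) | rand

print(make_flake())
```

No coordination between nodes is needed. The price is a small chance that two IDs in the same millisecond draw the same 23 bits.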
>>> from simpleflake import simpleflake
>>> simpleflake()
3594162604452825250L
Easy as pie
>>> from simpleflake import parse_simpleflake
>>> parse_simpleflake(3594162604452825250L)
SimpleFlake(timestamp=1375160370.606, random_bits=6768802L)
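Going the other way is just a shift and a mask. This is a hedged sketch rather than the library's code: `split_flake`, `Parts`, and the `EPOCH` constant are assumptions, with the epoch inferred from the sample output above.

```python
from collections import namedtuple

RANDOM_BITS = 23
# Assumed custom epoch in Unix seconds (inferred from the sample output).
EPOCH = 946702800

Parts = namedtuple("Parts", ["timestamp", "random_bits"])

def split_flake(flake):
    """Split an ID into (Unix timestamp in seconds, 23 random bits)."""
    millis = flake >> RANDOM_BITS            # high 41 bits
    rand = flake & ((1 << RANDOM_BITS) - 1)  # low 23 bits
    return Parts(EPOCH + millis / 1000.0, rand)

parts = split_flake(3594162604452825250)
print(parts)
```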
Chances of collision at 100 inserts / sec.
1.0787 × 10^-9
At an average of 100 inserts / second, chances you'll get exactly two in the same millisecond:
PDF[PoissonDistribution[0.1], 2]
= 0.00452419
For two requests in the same millisecond, chances you'll get the same number out of 2^23:
n^2/(2m) = 2^2/(2 × 2^23)
= 2.3842 × 10^-7
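The arithmetic on these slides can be reproduced with the standard library. This sketch assumes the same model the slides use: Poisson arrivals at 0.1 per millisecond and the n^2/(2m) birthday approximation. The variable names are made up.

```python
import math

def poisson_pmf(lam, k):
    """Probability of exactly k events when the expected count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# 100 inserts per second is 0.1 inserts per millisecond on average.
rate_per_ms = 100 / 1000.0
p_two = poisson_pmf(rate_per_ms, 2)

# Birthday approximation: n draws from m possible values.
n, m = 2, 2 ** 23
p_same = n ** 2 / (2.0 * m)

# Two inserts share a millisecond AND draw the same 23 random bits.
p_collision = p_two * p_same

print("%.8f" % p_two)
print("%.4e" % p_same)
print("%.4e" % p_collision)
```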
Totally backwards compatible with snowflake.
But they're a pain ...