Here's a scalar-valued function that does such a translation:
Points of interest:
- The function takes @seed as a parameter, which should be the same for all IDs in the batch. You may, and probably should, get a new random seed every time you start processing a new batch of IDs (in my case, generating stats for a new survey).
- The maximum number of distinct characters that can currently be used for encoding is 62: upper- and lower-case Latin letters and digits. To extend this approach further, one can either append more encoding characters to the set or modify the logic to encode user IDs with 2-character sequences. Using Latin letters alone, this would give 52^2 = 2,704 possible encodings, which is more than enough for all practical applications.
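To make the two points above concrete outside the database, here is a minimal Python sketch of the same idea. It is not the scalar-valued function itself: the names build_encoding and two_char_symbols are made up for illustration, and the sketch assumes that all IDs of a batch are known up front and can be sorted.

```python
import random
import string

# 62 encoding characters: upper-case letters, lower-case letters, digits.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits


def build_encoding(user_ids, seed, symbols=ALPHABET):
    """Return {user_id: symbol} for one batch; the seed fixes the shuffle."""
    distinct = sorted(set(user_ids))
    if len(distinct) > len(symbols):
        raise ValueError("more distinct IDs than available encodings")
    shuffled = list(symbols)
    random.Random(seed).shuffle(shuffled)  # same seed -> same order
    return dict(zip(distinct, shuffled))


def two_char_symbols():
    """All 2-letter sequences over the 52 Latin letters (52 * 52 of them)."""
    letters = string.ascii_letters
    return [a + b for a in letters for b in letters]


if __name__ == "__main__":
    ids = [1017, 2043, 3999]  # hypothetical user IDs
    print(build_encoding(ids, seed=1))  # batch 1
    print(build_encoding(ids, seed=2))  # batch 2: fresh seed for a new batch
    print(len(two_char_symbols()))      # size of the 2-character symbol set
```

Passing symbols=two_char_symbols() to build_encoding switches the sketch to 2-character encodings without touching the rest of the logic.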
Here are two examples of the same stats encoded in two separate batches.
Batch 1:
Same data in batch 2:
As you can see, the encoding characters were randomly changed for the second batch.