Go faster with Go: Golang for ML Serving
So the ask is to do 3 Million Predictions per second with as little resources as possible. Thankfully its one of the simpler model of Recommendation systems, Multi Armed Bandit(MAB). Multi Armed bandit usually involves sampling from distribution like Beta Distribution. That’s where the most time is spent. If we can concurrently do as many sampling as we can, we’ll use the resources well. Maximizing Resource utilization is the key to reducing overall resources needed for the model....