11/24 Have a stream of data (latency numbers for webpage requests).
Need the running median, approximate ok. How do I get this
without storing all the data?
\_ Are you the dude asking about 'variability' earlier?
\_ Make 'bins', say 0-9ms, 10-19ms, 20-29ms, ... and store how many
samples fall into each bin. You can then find the median to within
<binsize> accuracy, and it will use <max-latency>/<binsize> ints.
Latencies greater than <max-latency> can go in their own special
bin (one more int), but don't forget them when finding the median.
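A minimal Python sketch of the binning idea above, not a definitive implementation. The class name BinnedMedian, the 10ms bin width, and the 1000ms cap are all made up for illustration, and it assumes latencies are non-negative. It reports the midpoint of the bin holding the median sample, so the answer is off by at most <binsize>.

```python
class BinnedMedian:
    """Approximate running median from fixed-width histogram bins."""

    def __init__(self, bin_size_ms=10, max_latency_ms=1000):
        # Both defaults are illustrative assumptions, not values from the thread.
        self.bin_size = bin_size_ms
        self.num_bins = max_latency_ms // bin_size_ms
        # The extra last slot is the special bin for latencies >= max_latency_ms.
        self.counts = [0] * (self.num_bins + 1)
        self.total = 0

    def add(self, latency_ms):
        # Assumes latency_ms >= 0; anything past the cap lands in the overflow bin.
        idx = min(int(latency_ms // self.bin_size), self.num_bins)
        self.counts[idx] += 1
        self.total += 1

    def median(self):
        if self.total == 0:
            return None
        # Rank of the (lower) median sample among everything seen so far.
        target = (self.total + 1) // 2
        seen = 0
        for idx, count in enumerate(self.counts):
            seen += count
            if seen >= target:
                if idx == self.num_bins:
                    # Median fell in the overflow bin: only a lower bound is known.
                    return float(self.num_bins * self.bin_size)
                # Report the midpoint of the bin containing the median.
                return (idx + 0.5) * self.bin_size


stream = BinnedMedian(bin_size_ms=10, max_latency_ms=1000)
for latency in [5, 12, 14, 27, 33]:
    stream.add(latency)
print(stream.median())
```

Memory stays fixed at <max-latency>/<binsize> + 1 counters no matter how many samples arrive. If the median lands in the overflow bin, all you know is that it is at least <max-latency>, so pick the cap with that in mind.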
\_ Thanks. -op
Now, assuming you have this setup and know the true average, how do
you find the standard deviation?
\_ Try R: http://www.r-project.org