RT @_jasonwei: It's not immediately obvious why LM scaling plots use a log-scale x-axis, and as a result some people think that "emergent abilities" are not real and are just an artifact of the log-scale x-axis.
A quick post debunking that:
1. One reason for a log-scale x-axis is that models we… https://t.co/cP44L9n1Tr