Scaling Dense Linear Algebra on Multicore and Beyond: a Survey