Memory-augmented Transformers can implement Linear First-Order Optimization (arxiv.org) 1 point by PaulHoule 5 hours ago