Learning visual features under motion invariance