Cryogenic electron microscopy (cryo-EM) has become a widely used technique for determining the 3D structures of proteins. However, cryo-EM datasets are often heterogeneous, containing particle images of proteins in multiple conformational or compositional states. Here we propose CryoNeRF, a novel neural radiance fields (NeRF)-based cryo-EM reconstruction framework that operates directly in Euclidean 3D space. CryoNeRF introduces a multi-resolution hash encoding and a heterogeneity-aware cryo-EM encoder to model cryo-EM heterogeneity. Extensive experiments demonstrate the stability and superior performance of CryoNeRF in both homogeneous and heterogeneous settings. On homogeneous datasets, CryoNeRF achieves a 15.8% improvement over previous state-of-the-art methods. On both simulated and experimental heterogeneous datasets, CryoNeRF demonstrates exceptional capability in handling both conformational and compositional variations, with results consistent with previous experimental discoveries. Notably, in cases of compositional heterogeneity, CryoNeRF successfully distinguishes assembly states that account for only 2% of the particles in the dataset.