We use a 3D CNN with an encoder-decoder structure. The encoder is based on ResNet with attention modules; the decoder consists of atrous (dilated) convolution layers of different sizes.
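
To make the role of the atrous layers concrete, the following minimal sketch computes, for one spatial axis, the extent covered by a dilated kernel and the padding needed to keep the feature-map size unchanged at stride 1. It assumes that "different sizes" refers to different dilation rates; the kernel size, the rates (1, 2, 4), and the input depth of 32 voxels are illustrative assumptions, not values taken from the text. The formulas are the standard convolution-arithmetic ones and apply independently to each of the three axes of a 3D convolution.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Extent along one axis covered by a kernel with gaps of (dilation - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def same_padding(kernel_size: int, dilation: int) -> int:
    """Per-side zero padding that preserves size at stride 1 (odd kernel_size)."""
    return dilation * (kernel_size - 1) // 2


def output_size(input_size: int, kernel_size: int, dilation: int,
                padding: int, stride: int = 1) -> int:
    """Standard output-size formula for a (dilated) convolution along one axis."""
    return (input_size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# Hypothetical settings, chosen only for illustration.
KERNEL_SIZE = 3
DILATION_RATES = (1, 2, 4)
INPUT_DEPTH = 32

# One entry per decoder branch: (dilation, effective kernel, padding, output size).
branches = []
for rate in DILATION_RATES:
    pad = same_padding(KERNEL_SIZE, rate)
    branches.append((
        rate,
        effective_kernel(KERNEL_SIZE, rate),
        pad,
        output_size(INPUT_DEPTH, KERNEL_SIZE, rate, pad),
    ))

for rate, k_eff, pad, out in branches:
    print(f"dilation={rate}: effective kernel={k_eff}, padding={pad}, output={out}")
```

Under these assumptions, every branch keeps the same output size while covering a progressively larger neighbourhood, so the branch outputs can be concatenated or summed without resampling.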