The raw contigs FASTA file, the GTF annotations and the peptide sequences were downloaded from:
https://www.rosaceae.org

The raw genome FASTA was sorted by sequence header, joining each multi-line sequence into a single line, with:

    perl -lne 'if(/^(>.*)/){ $head=$1 } else { $fa{$head} .= $_ } END{ foreach $s (sort(keys(%fa))){ print "$s\n$fa{$s}\n" }}'

The hard-masked FASTA file was produced with WindowMasker by Francesc Montardit at https://genomevolution.org/coge
Masked residues (X or x) were then converted to N, and the sequences were sorted, with the following Perl commands:

    perl -lne 'if(!/^>/){ s/[Xx]/N/g } print' Prunus_avium.Satonishiki.PAV_r1.0.GDR.dna_rm.genome.coge | \
    perl -lne 'if(/^(>.*)/){ $head=$1 } else { $fa{$head} .= $_ } END{ foreach $s (sort(keys(%fa))){ print "$s\n$fa{$s}\n" }}'

The remaining files were produced by the RSAT tools when this genome was installed.
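
For readers who do not use Perl, the logic of the two one-liners above can be sketched in Python. This is only an illustration of what the commands do and was not used to produce the files. The function names convert_mask and sort_fasta and the toy records are hypothetical, and the exact output formatting of the Perl commands is not reproduced.

```python
import re
from typing import Dict, Iterable, Iterator, List, Tuple


def convert_mask(lines: Iterable[str]) -> Iterator[str]:
    """Replace X/x with N on sequence lines; header lines pass through unchanged."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            yield line
        else:
            yield re.sub(r"[Xx]", "N", line)


def sort_fasta(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Join each record's sequence lines and return records sorted by header.

    Headers are compared as plain strings, as with Perl's default sort.
    Sequence lines that appear before the first header are ignored here
    (an assumption of this sketch).
    """
    records: Dict[str, str] = {}
    head = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            head = line
        elif head is not None:
            # Concatenate wrapped sequence lines under the current header.
            records[head] = records.get(head, "") + line
    return [(h, records[h]) for h in sorted(records)]


if __name__ == "__main__":
    # Hypothetical toy input, not taken from the real genome.
    toy = [">chr2", "ACxX", "GT", ">chr1", "xxAC"]
    for header, seq in sort_fasta(convert_mask(toy)):
        print(header)
        print(seq)
```

Because the sort is a plain string sort, a header such as ">chr10" is placed before ">chr2".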