The time WRF needs to write output (e.g. history stream 0) depends strongly on whether the data is written into a single NetCDF file or into one NetCDF file per processor. In my case, with 1024 processors, writing one single output file takes about 120 seconds (for every time step!). With the io_form_history = 102 option, every processor writes into its "own" NetCDF file, which in my case takes only about 3 seconds!
This splitting option is also very useful for restart files (io_form_restart = 102), as one should restart the model with the same number of processors anyway.
The drawback is that the single files from all processors have to be joined into one file before the data can be used properly. I use joinwrfh from ARPS (version 5.3.3), which you can download and use for free. It is possible to compile only the joinwrfh program with:
makearps -rd /path/to/arps5.3.3 -io net joinwrfh
In its original design, joinwrfh processes one split NetCDF file (called a patch) after the other. For every patch, all variables stored in that patch are written to their corresponding position in the final NetCDF file. So there is a loop over all patches to be combined, and inside it a loop over all variables. This implies a lot of write operations on the final file: one for every variable in every patch.
I rearranged some loops in the code to make joinwrfh much faster. Instead of more than 20 hours, it now takes between 10 and 20 minutes (depending on the domain size, the number of split files, and the number and dimensions of the variables). In my version of joinwrfh, there is a loop over all variables contained in the WRF output, and inside this loop there is a loop over all patches (split files). The memory needed for one joined variable is allocated at the beginning of each iteration of the variable loop (i.e. it is held in memory). The program then reads this variable from every patch and copies the data quickly into the allocated joined variable (in memory), which is written to the final wrfout_d01* NetCDF file at the end of the patch loop. This procedure is much faster because the variable is assembled from all of its split parts in memory and is written to the final NetCDF file only once. Writing to a NetCDF file takes much more time than storing something in memory.
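To make the difference between the two loop orders concrete, here is a small toy sketch in Python. It is not the Fortran code from fjoinwrf.f90: the class CountingFile, the functions join_patch_major, join_variable_major and make_patches, and the synthetic data are all made up for illustration. The "file" only counts how often it is written to, and the variable names T2, U10 and V10 are used just as examples.

```python
class CountingFile:
    """Hypothetical stand-in for the final NetCDF file; it only counts writes."""

    def __init__(self, var_names, ny, nx):
        self.data = {v: [[None] * nx for _ in range(ny)] for v in var_names}
        self.writes = 0

    def write(self, var, y0, x0, block):
        # Each call models one (slow) write operation on the final file.
        self.writes += 1
        for j, row in enumerate(block):
            self.data[var][y0 + j][x0:x0 + len(row)] = row


def join_patch_major(patches, var_names, ny, nx):
    """Original order as described above: loop over patches, then variables."""
    out = CountingFile(var_names, ny, nx)
    for patch in patches:
        for var in var_names:
            out.write(var, patch["y0"], patch["x0"], patch["vars"][var])
    return out


def join_variable_major(patches, var_names, ny, nx):
    """Rearranged order: loop over variables, then patches; one write per variable."""
    out = CountingFile(var_names, ny, nx)
    for var in var_names:
        joined = [[None] * nx for _ in range(ny)]  # joined variable held in memory
        for patch in patches:
            for j, row in enumerate(patch["vars"][var]):
                x0 = patch["x0"]
                joined[patch["y0"] + j][x0:x0 + len(row)] = row
        out.write(var, 0, 0, joined)  # single write of the assembled variable
    return out


def make_patches(ny, nx, py, px, var_names):
    """Split an ny x nx domain into py x px tiles filled with synthetic values."""
    patches = []
    for ty in range(py):
        for tx in range(px):
            y0, y1 = ty * ny // py, (ty + 1) * ny // py
            x0, x1 = tx * nx // px, (tx + 1) * nx // px
            tile = {
                v: [[(v, j, i) for i in range(x0, x1)] for j in range(y0, y1)]
                for v in var_names
            }
            patches.append({"y0": y0, "x0": x0, "vars": tile})
    return patches


if __name__ == "__main__":
    names = ["T2", "U10", "V10"]
    patches = make_patches(8, 8, 4, 4, names)  # 16 patches
    old = join_patch_major(patches, names, 8, 8)
    new = join_variable_major(patches, names, 8, 8)
    print(old.writes, new.writes)  # 48 3
    print(old.data == new.data)    # True
```

Both orders produce the same joined fields, but the number of writes to the final file drops from (patches x variables) to (variables). The price is that one complete joined variable has to fit into memory at a time.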
Modified routine in /arps5.3.3/src/wrf2arps/:
Replace the file in your original ARPS installation (path above), and remember to keep a copy of the original fjoinwrf.f90 file. Of course, you then need to compile joinwrfh again.
It is not perfect, but it is faster and it works correctly. All metadata is preserved and all patches end up at the correct position.
Please note the following statement from ARPS:
The Advanced Regional Prediction System (ARPS), including its pre- and post-processing and data assimilation packages, resides in the public domain and may be used free of charge, and without restriction, including for-profit activities by private industry.
My changes are provided "as is", without warranty of any kind.