.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/suvi/suvi_download_L1b.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_examples_suvi_suvi_download_L1b.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_suvi_suvi_download_L1b.py:


Download SUVI L1b files
=======================

Purpose: Download SUVI L1b files from the NOAA webserver.

.. GENERATED FROM PYTHON SOURCE LINES 9-12

First of all: if you have wget and want an easy solution outside of Python, here are a few
bash one-liner examples (remove the #) that can be applied to GOES-16, GOES-17, and
different wavelengths with minor changes:

.. GENERATED FROM PYTHON SOURCE LINES 14-15

Download an entire day of 171 data to the current directory (long and short exposures):

.. GENERATED FROM PYTHON SOURCE LINES 15-18

.. code-block:: Python

    #wget -nH -nd -r -np -A *.fits.gz https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe171/2021/02/18/

.. GENERATED FROM PYTHON SOURCE LINES 19-20

Download only the 171 data between 1 and 2 pm that day to the current directory:

.. GENERATED FROM PYTHON SOURCE LINES 20-23

.. code-block:: Python

    #wget -nH -nd -r -np -A OR_SUVI-L1b-Fe171_G16_s2021???13*.fits.gz https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe171/2021/02/18/

.. GENERATED FROM PYTHON SOURCE LINES 24-25

Same as above, but for all SUVI wavelengths, downloaded into their respective subdirectories:

.. GENERATED FROM PYTHON SOURCE LINES 25-28

.. code-block:: Python

    #for w in fe094 fe131 fe171 fe195 fe284 he304; do wget -nH -nd -r -np --directory-prefix=$w -A OR_SUVI-L1b-?e???_G16_s2021???13*.fits.gz https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-$w/2021/02/18/; done;

.. GENERATED FROM PYTHON SOURCE LINES 29-30

Now let's use Python. Import the necessary libraries:

.. GENERATED FROM PYTHON SOURCE LINES 30-37

.. code-block:: Python

    __author__ = "cbethge"

    from bs4 import BeautifulSoup
    from astropy.time import Time, TimeDelta
    import requests, os
    import numpy as np

.. GENERATED FROM PYTHON SOURCE LINES 38-39

Define a parser for the SUVI websites using BeautifulSoup:

.. GENERATED FROM PYTHON SOURCE LINES 39-44

.. code-block:: Python

    def list_url_directory(url, ext=''):
        page = requests.get(url).text
        soup = BeautifulSoup(page, 'html.parser')
        return [url + node.get('href') for node in soup.find_all('a') if node.get('href').endswith(ext)]

.. GENERATED FROM PYTHON SOURCE LINES 45-65

Now we define the variable ``date_time``. The general format is
'YYYY-MM-DDThh:mm:ss', and it can be given in the following ways:

Single date/time:
``'2020-01-05T12:30:00'``

Several dates/times:
``'2020-01-05T12:30:00, 2020-04-23T11:43:00, 2020-05-11T17:05:00'``

JSOC-style with start time, timespan, and cadence
(image every 20 min for 1 hour in this example):
``'2020-01-05T12:30:00/1h@20m'``

If one or several explicit dates/times are given, the code will only download the
data closest to those timestamps. For the JSOC-style query, it will download
everything in the given range with the given cadence. Note that SUVI has an
imaging cadence of 4 minutes, so any given cadence should be a multiple of
4 minutes. An exception is the 195 channel, where images are taken more
frequently. Accepted units for the timespan and cadence are: 'd' (days),
'h' (hours), and 'm' (minutes).
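For reference, here is a minimal sketch of how such a JSOC-style string breaks down
into a start time plus a timespan and cadence in seconds. The helper name
``parse_jsoc_style`` and the unit mapping are purely illustrative and are not part of
the example script, which performs the same parsing inline further below.

.. code-block:: Python

    # Minimal sketch (not used by the script below): decompose a JSOC-style string.
    def parse_jsoc_style(date_time):
        seconds_per_unit = {'m': 60., 'h': 3600., 'd': 86400.}  # accepted units
        start, rest = date_time.split('/')   # e.g. '2020-01-05T12:30:00' and '1h@20m'
        if '@' in rest:
            timespan_string, cadence_string = rest.split('@')
        else:
            # no cadence given: fall back to the nominal 4-minute SUVI imaging cadence
            timespan_string, cadence_string = rest, '4m'
        timespan = float(timespan_string[:-1]) * seconds_per_unit[timespan_string[-1]]
        cadence = float(cadence_string[:-1]) * seconds_per_unit[cadence_string[-1]]
        return start, timespan, cadence

    print(parse_jsoc_style('2020-01-05T12:30:00/1h@20m'))
    # ('2020-01-05T12:30:00', 3600.0, 1200.0)

The actual script starts from the ``date_time`` defined next and does this parsing inline.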
.. GENERATED FROM PYTHON SOURCE LINES 65-67

.. code-block:: Python

    date_time = '2020-01-05T12:30:00/1h@20m'

.. GENERATED FROM PYTHON SOURCE LINES 68-69

A few other definitions:

.. GENERATED FROM PYTHON SOURCE LINES 69-80

.. code-block:: Python

    spacecraft = 16         # GOES 16 or 17?
    wavelengths = [171,195] # Wavelengths. Valid values: 93, 94, 131, 171, 195, 284, 304, 305.
    outdir = './L1b'        # The download path. Subdirectories for the wavelengths will be created.
    query_only = False      # If True, then the filenames are printed only, nothing is downloaded.
    verbose = True          # If True, then print the filenames when downloading.
    long_exp = True         # Download long exposures?
    short_exp = False       # Download short exposures? Note that it will only download one of the two
                            # short exposures for 94 and 131, the one that is closer to the given time.
                            # So depending if you want the 'short exposure' or the 'short flare exposure'
                            # in those channels, it might take a bit of fiddling with the chosen start time.

.. GENERATED FROM PYTHON SOURCE LINES 81-82

Run the code:

.. GENERATED FROM PYTHON SOURCE LINES 82-239

.. code-block:: Python

    for wavelength in wavelengths:
        # Split the date argument at the commas (if applicable). The isinstance check
        # ensures this only happens once, on the first pass through the wavelength loop,
        # while date_time is still a string.
        if isinstance(date_time, str):
            date_time = date_time.replace(" ","").split(',')
            if len(date_time) == 1:
                # If it is not several dates, take only the first item. That way,
                # we can distinguish between lists and strings below.
                date_time = date_time[0]

        # this should stay the same for now
        baseurl1 = 'https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes'
        baseurl2 = '/l1b/'
        ext = '.fits.gz'

        # check for existing output directory and correct spacecraft and wavelength numbers
        if not query_only:
            # Create the output directory if it does not exist
            try:
                os.makedirs(outdir)
            except FileExistsError:
                # directory already exists
                pass

        spacecraft_numbers = [16,17]
        if spacecraft not in spacecraft_numbers:
            raise Exception('Invalid spacecraft number: '+str(spacecraft)+'. Valid values are: 16, 17.')

        wvln_path = dict({ 93:'suvi-l1b-fe094', 94:'suvi-l1b-fe094', 131:'suvi-l1b-fe131', 171:'suvi-l1b-fe171', \
                          195:'suvi-l1b-fe195', 284:'suvi-l1b-fe284', 304:'suvi-l1b-he304', 305:'suvi-l1b-he304' })
        if wavelength not in wvln_path:
            raise Exception('Invalid wavelength: '+str(wavelength)+'. Valid values are: 93, 94, 131, 171, 195, 284, 304, 305.')

        # Figure out what kind of date_time was given.
        if isinstance(date_time, str):
            # Check if it is a JSOC-style query
            if len(date_time.split('/')) == 2:
                if len(date_time.split('@')) == 2:
                    cadence_string = date_time.split('@')[1]
                    timespan_string = date_time.split('@')[0].split('/')[1]
                    cadence = float(cadence_string[:-1])
                    cadence_unit = cadence_string[-1]
                    if cadence_unit == 'm':
                        cadence = cadence*60.
                    elif cadence_unit == 'h':
                        cadence = cadence*60.*60.
                    elif cadence_unit == 'd':
                        cadence = cadence*60.*60*24.
                    else:
                        raise Exception('Not a valid time unit (must be m, h, or d).')
                else:
                    cadence = 240.
                    timespan_string = date_time.split('/')[1]

                timespan = float(timespan_string[:-1])
                timespan_unit = timespan_string[-1]
                if timespan_unit == 'm':
                    timespan = timespan*60.
                elif timespan_unit == 'h':
                    timespan = timespan*60.*60.
                elif timespan_unit == 'd':
                    timespan = timespan*60.*60*24.
                else:
                    raise Exception('Not a valid time unit (must be m, h, or d).')

                t0 = Time(date_time.split('/')[0], scale='utc', format='isot')
                tmp_timestamp = []
                counter = 0
                while counter*cadence <= timespan:
                    tmp_timestamp.append(counter*cadence)
                    counter += 1

                timestamp = t0+TimeDelta(tmp_timestamp, format='sec')
                urls = []
                for time in timestamp:
                    urls.append(baseurl1+str(spacecraft)+baseurl2+wvln_path[wavelength]+'/'+time.value[0:10].replace('-','/')+'/')
            else:
                # Only one date, and no JSOC-style query
                timestamp = [Time(date_time, scale='utc', format='isot')]
                urls = [baseurl1+str(spacecraft)+baseurl2+wvln_path[wavelength]+'/'+date_time[0:10].replace('-','/')+'/']

        elif isinstance(date_time, list):
            # if the argument was a list of dates
            timestamp = []
            urls = []
            for this_date in date_time:
                timestamp.append(Time(this_date, scale='utc', format='isot'))
                urls.append(baseurl1+str(spacecraft)+baseurl2+wvln_path[wavelength]+'/'+this_date[0:10].replace('-','/')+'/')

        # Before we run, check if all of the websites are there.
        # Cook the urls down to unique values. To do that, convert
        # to a numpy array, use np.unique, and then convert back
        # to a list. Tried by using conversion to a set first,
        # but that doesn't keep the correct order for the dates.
        urls_arr = np.array(urls)
        urls_unique = np.unique(urls_arr).tolist()

        all_files = []
        start_time = []
        end_time = []
        for url in urls_unique:
            request = requests.get(url)
            if not request.status_code == 200:
                raise Exception('Website not found: '+url)
            else:
                # If all of the websites were found, go ahead and make lists of files and dates.
                print('Querying', url, 'for SUVI files...')
                for file in list_url_directory(url, ext):
                    all_files.append(file)
                    file_base = os.path.basename(file)
                    start_time.append(url[-11:-1].replace('/','-')+'T'+file_base[30:32]+':'+file_base[32:34]+':'+\
                                      file_base[34:36]+'.'+file_base[36]+'00')
                    end_time.append(url[-11:-1].replace('/','-')+'T'+file_base[46:48]+':'+file_base[48:50]+':'+\
                                    file_base[50:52]+'.'+file_base[52]+'00')

        # Create the subdirectory for the current wavelength
        this_outdir = os.path.join(outdir, str(wavelength))
        try:
            os.makedirs(this_outdir)
        except FileExistsError:
            # directory already exists
            pass

        # Make astropy time objects from the start and end times, compute the exposure time from that.
        start_time = Time(start_time, scale='utc', format='isot')
        end_time = Time(end_time, scale='utc', format='isot')
        exposure_time = end_time-start_time

        # Sort in long and short exposures.
        long_exposures = np.where(np.around(exposure_time.sec) == 1)
        short_exposures = np.where(np.around(exposure_time.sec) == 0)
        long_exposure_files = np.array(all_files)[long_exposures]
        short_exposure_files = np.array(all_files)[short_exposures]

        # Now go through all of the requested times and download/print the files.
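        # For each requested timestamp, np.abs(delta_t).argmin() below selects the single
        # file whose start time is closest to that timestamp, so exactly one long and/or
        # short exposure file is printed or downloaded per requested time.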
        for time in timestamp:
            if long_exp:
                delta_t = time-start_time[long_exposures]
                which_file = np.abs(delta_t).argmin()
                if query_only:
                    print('Long exposure: ', long_exposure_files[which_file])
                else:
                    if verbose:
                        print('Long exposure: ', long_exposure_files[which_file])
                    f = requests.get(long_exposure_files[which_file])
                    open(os.path.join(this_outdir, os.path.basename(long_exposure_files[which_file])), 'wb').write(f.content)

            if short_exp:
                delta_t = time-start_time[short_exposures]
                which_file = np.abs(delta_t).argmin()
                if query_only:
                    print('Short exposure:', short_exposure_files[which_file])
                else:
                    if verbose:
                        print('Short exposure:', short_exposure_files[which_file])
                    f = requests.get(short_exposure_files[which_file])
                    open(os.path.join(this_outdir, os.path.basename(short_exposure_files[which_file])), 'wb').write(f.content)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Querying https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe171/2020/01/05/ for SUVI files...
    Long exposure:  https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe171/2020/01/05/OR_SUVI-L1b-Fe171_G16_s20200051230404_e20200051230414_c20200051231078.fits.gz
    Long exposure:  https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe171/2020/01/05/OR_SUVI-L1b-Fe171_G16_s20200051250405_e20200051250415_c20200051251081.fits.gz
    Long exposure:  https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe171/2020/01/05/OR_SUVI-L1b-Fe171_G16_s20200051310405_e20200051310415_c20200051311080.fits.gz
    Long exposure:  https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe171/2020/01/05/OR_SUVI-L1b-Fe171_G16_s20200051330406_e20200051330416_c20200051331076.fits.gz
    Querying https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe195/2020/01/05/ for SUVI files...
    Long exposure:  https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe195/2020/01/05/OR_SUVI-L1b-Fe195_G16_s20200051229504_e20200051229514_c20200051230173.fits.gz
    Long exposure:  https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe195/2020/01/05/OR_SUVI-L1b-Fe195_G16_s20200051249505_e20200051249515_c20200051250177.fits.gz
    Long exposure:  https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe195/2020/01/05/OR_SUVI-L1b-Fe195_G16_s20200051309505_e20200051309515_c20200051310180.fits.gz
    Long exposure:  https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe195/2020/01/05/OR_SUVI-L1b-Fe195_G16_s20200051329506_e20200051329516_c20200051330197.fits.gz

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 13.226 seconds)


.. _sphx_glr_download_examples_suvi_suvi_download_L1b.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: suvi_download_L1b.ipynb <suvi_download_L1b.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: suvi_download_L1b.py <suvi_download_L1b.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: suvi_download_L1b.zip <suvi_download_L1b.zip>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_