.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/suvi/suvi_download_L1b.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_examples_suvi_suvi_download_L1b.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_suvi_suvi_download_L1b.py:


Download SUVI L1b files
=======================

Purpose: Download SUVI L1b files from the NOAA webserver.

.. GENERATED FROM PYTHON SOURCE LINES 9-12

First of all: if you have wget and want an easy solution outside of Python, here are a few
bash one-liner examples (remove the #) that can be applied to GOES-16, GOES-17, and
different wavelengths with minor changes:

.. GENERATED FROM PYTHON SOURCE LINES 14-15

Download an entire day of 171 data to the current directory (long and short exposures):

.. GENERATED FROM PYTHON SOURCE LINES 15-18

.. code-block:: Python

    #wget -nH -nd -r -np -A *.fits.gz https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe171/2021/02/18/

.. GENERATED FROM PYTHON SOURCE LINES 19-20

Download only the 171 data between 1 and 2 pm that day to the current directory:

.. GENERATED FROM PYTHON SOURCE LINES 20-23

.. code-block:: Python

    #wget -nH -nd -r -np -A OR_SUVI-L1b-Fe171_G16_s2021???13*.fits.gz https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe171/2021/02/18/

.. GENERATED FROM PYTHON SOURCE LINES 24-25

Same as above, but for all SUVI wavelengths, downloaded into their respective subdirectories:

.. GENERATED FROM PYTHON SOURCE LINES 25-28

.. code-block:: Python

    #for w in fe094 fe131 fe171 fe195 fe284 he304; do wget -nH -nd -r -np --directory-prefix=$w -A OR_SUVI-L1b-?e???_G16_s2021???13*.fits.gz https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-$w/2021/02/18/; done;

.. GENERATED FROM PYTHON SOURCE LINES 29-30

Now let's use Python. Import the necessary libraries:

.. GENERATED FROM PYTHON SOURCE LINES 30-37

.. code-block:: Python

    __author__ = "cbethge"

    from bs4 import BeautifulSoup
    from astropy.time import Time, TimeDelta
    import requests, os
    import numpy as np

.. GENERATED FROM PYTHON SOURCE LINES 38-39

Define a parser for the SUVI websites using BeautifulSoup:

.. GENERATED FROM PYTHON SOURCE LINES 39-44

.. code-block:: Python

    def list_url_directory(url, ext=''):
        page = requests.get(url).text
        soup = BeautifulSoup(page, 'html.parser')
        return [url + node.get('href') for node in soup.find_all('a') if node.get('href').endswith(ext)]

.. GENERATED FROM PYTHON SOURCE LINES 45-65

Now we define the variable ``date_time``. The general format is
'YYYY-MM-DDThh:mm:ss', and it can be given in the following ways:

Single date/time:
``'2020-01-05T12:30:00'``

Several dates/times:
``'2020-01-05T12:30:00, 2020-04-23T11:43:00, 2020-05-11T17:05:00'``

JSOC-style with start time, timespan, and cadence
(image every 20 min for 1 hour in this example):
``'2020-01-05T12:30:00/1h@20m'``

If one or several explicit dates/times are given, the code will only download the
data closest to those timestamps. For the JSOC-style query, it will download
everything in the given range with the given cadence. Note that SUVI has an
imaging cadence of 4 minutes, so any given cadence should be a multiple of
4 minutes. An exception is the 195 channel, where images are taken more
frequently. Accepted units for the timespan and cadence are: 'd' (days),
'h' (hours), and 'm' (minutes).
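For reference, here is a minimal sketch of how such a JSOC-style string breaks down
into a start time plus a timespan and cadence in seconds. The helper name
``parse_jsoc_style`` and the unit mapping are purely illustrative and are not part of
the example script, which performs the same parsing inline further below.

.. code-block:: Python

    # Minimal sketch (not used by the script below): decompose a JSOC-style string.
    def parse_jsoc_style(date_time):
        seconds_per_unit = {'m': 60., 'h': 3600., 'd': 86400.}  # accepted units
        start, rest = date_time.split('/')   # e.g. '2020-01-05T12:30:00' and '1h@20m'
        if '@' in rest:
            timespan_string, cadence_string = rest.split('@')
        else:
            # no cadence given: fall back to the nominal 4-minute SUVI imaging cadence
            timespan_string, cadence_string = rest, '4m'
        timespan = float(timespan_string[:-1]) * seconds_per_unit[timespan_string[-1]]
        cadence = float(cadence_string[:-1]) * seconds_per_unit[cadence_string[-1]]
        return start, timespan, cadence

    print(parse_jsoc_style('2020-01-05T12:30:00/1h@20m'))
    # ('2020-01-05T12:30:00', 3600.0, 1200.0)

The actual script starts from the ``date_time`` defined next and does this parsing inline.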
.. GENERATED FROM PYTHON SOURCE LINES 65-67

.. code-block:: Python

    date_time = '2020-01-05T12:30:00/1h@20m'

.. GENERATED FROM PYTHON SOURCE LINES 68-69

A few other definitions:

.. GENERATED FROM PYTHON SOURCE LINES 69-80

.. code-block:: Python

    spacecraft = 16         # GOES 16 or 17?
    wavelengths = [171,195] # Wavelengths. Valid values: 93, 94, 131, 171, 195, 284, 304, 305.
    outdir = './L1b'        # The download path. Subdirectories for the wavelengths will be created.
    query_only = False      # If True, then the filenames are printed only, nothing is downloaded.
    verbose = True          # If True, then print the filenames when downloading.
    long_exp = True         # Download long exposures?
    short_exp = False       # Download short exposures? Note that it will only download one of the two
                            # short exposures for 94 and 131, the one that is closer to the given time.
                            # So depending if you want the 'short exposure' or the 'short flare exposure'
                            # in those channels, it might take a bit of fiddling with the chosen start time.

.. GENERATED FROM PYTHON SOURCE LINES 81-82

Run the code:

.. GENERATED FROM PYTHON SOURCE LINES 82-239

.. code-block:: Python

    for wavelength in wavelengths:
        # Split the date argument at the commas (if applicable). The isinstance check
        # ensures this only happens once, on the first pass through the wavelength loop,
        # while date_time is still a string.
        if isinstance(date_time, str):
            date_time = date_time.replace(" ","").split(',')
            if len(date_time) == 1:
                # If it is not several dates, take only the first item. That way,
                # we can distinguish between lists and strings below.
                date_time = date_time[0]

        # this should stay the same for now
        baseurl1 = 'https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes'
        baseurl2 = '/l1b/'
        ext = '.fits.gz'

        # check for existing output directory and correct spacecraft and wavelength numbers
        if not query_only:
            # Create the output directory if it does not exist
            try:
                os.makedirs(outdir)
            except FileExistsError:
                # directory already exists
                pass

        spacecraft_numbers = [16,17]
        if spacecraft not in spacecraft_numbers:
            raise Exception('Invalid spacecraft number: '+str(spacecraft)+'. Valid values are: 16, 17.')

        wvln_path = dict({ 93:'suvi-l1b-fe094', 94:'suvi-l1b-fe094', 131:'suvi-l1b-fe131', 171:'suvi-l1b-fe171', \
                          195:'suvi-l1b-fe195', 284:'suvi-l1b-fe284', 304:'suvi-l1b-he304', 305:'suvi-l1b-he304' })
        if wavelength not in wvln_path:
            raise Exception('Invalid wavelength: '+str(wavelength)+'. Valid values are: 93, 94, 131, 171, 195, 284, 304, 305.')

        # Figure out what kind of date_time was given.
        if isinstance(date_time, str):
            # Check if it is a JSOC-style query
            if len(date_time.split('/')) == 2:
                if len(date_time.split('@')) == 2:
                    cadence_string = date_time.split('@')[1]
                    timespan_string = date_time.split('@')[0].split('/')[1]
                    cadence = float(cadence_string[:-1])
                    cadence_unit = cadence_string[-1]
                    if cadence_unit == 'm':
                        cadence = cadence*60.
                    elif cadence_unit == 'h':
                        cadence = cadence*60.*60.
                    elif cadence_unit == 'd':
                        cadence = cadence*60.*60*24.
                    else:
                        raise Exception('Not a valid time unit (must be m, h, or d).')
                else:
                    cadence = 240.
                    timespan_string = date_time.split('/')[1]

                timespan = float(timespan_string[:-1])
                timespan_unit = timespan_string[-1]
                if timespan_unit == 'm':
                    timespan = timespan*60.
                elif timespan_unit == 'h':
                    timespan = timespan*60.*60.
                elif timespan_unit == 'd':
                    timespan = timespan*60.*60*24.
                else:
                    raise Exception('Not a valid time unit (must be m, h, or d).')

                t0 = Time(date_time.split('/')[0], scale='utc', format='isot')
                tmp_timestamp = []
                counter = 0
                while counter*cadence <= timespan:
                    tmp_timestamp.append(counter*cadence)
                    counter += 1

                timestamp = t0+TimeDelta(tmp_timestamp, format='sec')
                urls = []
                for time in timestamp:
                    urls.append(baseurl1+str(spacecraft)+baseurl2+wvln_path[wavelength]+'/'+time.value[0:10].replace('-','/')+'/')
            else:
                # Only one date, and no JSOC-style query
                timestamp = [Time(date_time, scale='utc', format='isot')]
                urls = [baseurl1+str(spacecraft)+baseurl2+wvln_path[wavelength]+'/'+date_time[0:10].replace('-','/')+'/']

        elif isinstance(date_time, list):
            # if the argument was a list of dates
            timestamp = []
            urls = []
            for this_date in date_time:
                timestamp.append(Time(this_date, scale='utc', format='isot'))
                urls.append(baseurl1+str(spacecraft)+baseurl2+wvln_path[wavelength]+'/'+this_date[0:10].replace('-','/')+'/')

        # Before we run, check if all of the websites are there.
        # Cook the urls down to unique values. To do that, convert
        # to a numpy array, use np.unique, and then convert back
        # to a list. Tried by using conversion to a set first,
        # but that doesn't keep the correct order for the dates.
        urls_arr = np.array(urls)
        urls_unique = np.unique(urls_arr).tolist()

        all_files = []
        start_time = []
        end_time = []
        for url in urls_unique:
            request = requests.get(url)
            if not request.status_code == 200:
                raise Exception('Website not found: '+url)
            else:
                # If all of the websites were found, go ahead and make lists of files and dates.
                print('Querying', url, 'for SUVI files...')
                for file in list_url_directory(url, ext):
                    all_files.append(file)
                    file_base = os.path.basename(file)
                    start_time.append(url[-11:-1].replace('/','-')+'T'+file_base[30:32]+':'+file_base[32:34]+':'+\
                                      file_base[34:36]+'.'+file_base[36]+'00')
                    end_time.append(url[-11:-1].replace('/','-')+'T'+file_base[46:48]+':'+file_base[48:50]+':'+\
                                    file_base[50:52]+'.'+file_base[52]+'00')

        # Create the subdirectory for the current wavelength
        this_outdir = os.path.join(outdir, str(wavelength))
        try:
            os.makedirs(this_outdir)
        except FileExistsError:
            # directory already exists
            pass

        # Make astropy time objects from the start and end times, compute the exposure time from that.
        start_time = Time(start_time, scale='utc', format='isot')
        end_time = Time(end_time, scale='utc', format='isot')
        exposure_time = end_time-start_time

        # Sort in long and short exposures.
        long_exposures = np.where(np.around(exposure_time.sec) == 1)
        short_exposures = np.where(np.around(exposure_time.sec) == 0)
        long_exposure_files = np.array(all_files)[long_exposures]
        short_exposure_files = np.array(all_files)[short_exposures]

        # Now go through all of the requested times and download/print the files.
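        # For each requested timestamp, np.abs(delta_t).argmin() below selects the single
        # file whose start time is closest to that timestamp, so exactly one long and/or
        # short exposure file is printed or downloaded per requested time.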
        for time in timestamp:
            if long_exp:
                delta_t = time-start_time[long_exposures]
                which_file = np.abs(delta_t).argmin()
                if query_only:
                    print('Long exposure: ', long_exposure_files[which_file])
                else:
                    if verbose:
                        print('Long exposure: ', long_exposure_files[which_file])
                    f = requests.get(long_exposure_files[which_file])
                    open(os.path.join(this_outdir, os.path.basename(long_exposure_files[which_file])), 'wb').write(f.content)

            if short_exp:
                delta_t = time-start_time[short_exposures]
                which_file = np.abs(delta_t).argmin()
                if query_only:
                    print('Short exposure:', short_exposure_files[which_file])
                else:
                    if verbose:
                        print('Short exposure:', short_exposure_files[which_file])
                    f = requests.get(short_exposure_files[which_file])
                    open(os.path.join(this_outdir, os.path.basename(short_exposure_files[which_file])), 'wb').write(f.content)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Querying https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe171/2020/01/05/ for SUVI files...
    Long exposure:  https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe171/2020/01/05/OR_SUVI-L1b-Fe171_G16_s20200051230404_e20200051230414_c20200051231078.fits.gz
    Long exposure:  https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe171/2020/01/05/OR_SUVI-L1b-Fe171_G16_s20200051250405_e20200051250415_c20200051251081.fits.gz
    Long exposure:  https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe171/2020/01/05/OR_SUVI-L1b-Fe171_G16_s20200051310405_e20200051310415_c20200051311080.fits.gz
    Long exposure:  https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe171/2020/01/05/OR_SUVI-L1b-Fe171_G16_s20200051330406_e20200051330416_c20200051331076.fits.gz
    Querying https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe195/2020/01/05/ for SUVI files...
    Long exposure:  https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe195/2020/01/05/OR_SUVI-L1b-Fe195_G16_s20200051229504_e20200051229514_c20200051230173.fits.gz
    Long exposure:  https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe195/2020/01/05/OR_SUVI-L1b-Fe195_G16_s20200051249505_e20200051249515_c20200051250177.fits.gz
    Long exposure:  https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe195/2020/01/05/OR_SUVI-L1b-Fe195_G16_s20200051309505_e20200051309515_c20200051310180.fits.gz
    Long exposure:  https://data.ngdc.noaa.gov/platforms/solar-space-observing-satellites/goes/goes16/l1b/suvi-l1b-fe195/2020/01/05/OR_SUVI-L1b-Fe195_G16_s20200051329506_e20200051329516_c20200051330197.fits.gz

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 13.226 seconds)


.. _sphx_glr_download_examples_suvi_suvi_download_L1b.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: suvi_download_L1b.ipynb <suvi_download_L1b.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: suvi_download_L1b.py <suvi_download_L1b.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: suvi_download_L1b.zip <suvi_download_L1b.zip>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_