L01 pipeline - Title page and Contents

1) Space Science & Technology Department, Rutherford Appleton Laboratory, Chilton, Didcot, Oxon. OX11 0QX, UK
2) ESA Satellite Tracking Station, Villafranca del Castillo, Madrid, Spain

Acknowledgements

The authors would like to thank Jean-Paul Baluteau (LAM), Emmanuel Caux (CESR), Cécile Gry (IDC-ESA) and Tanya Lim (RAL) amongst others for their guidance and encouragement during the production of the L01 pipeline.

Documentation

With the release of the Infrared Space Observatory legacy archive, Long Wavelength Spectrometer (LWS) observations are available to the scientific community. In addition to the basic data the ISO project is keen to release more highly processed data sets that have been reduced by groups familiar with the data. These provide resources that are of more immediate use to the community, and in particular new users.

The LWS has four modes of operation yielding intermediate and high-resolution spectra and the first of these, L01, covers the full wavelength range of the instrument 43 - 197 microns at intermediate resolution. The process described here is an automatic cleaning and averaging of the LWS L01 data and has been developed in consultation with members of the LWS consortium and the LWS expert at Vilspa, specifically as a Highly Processed Data Product (HPDP) data set for the archive.

The pipeline uses as input the final flux-calibrated LSAN files generated by the off-line processing OLP v10.1. These are the usual starting point for any analysis of the LWS spectra. An interactive package, the ISO Spectroscopic Analysis Package (ISAP), is available to reduce LWS spectra, and while this contains many routines to visualise and manipulate the data, it is unable to use particular features of the LWS data in the reduction that the pipeline is designed to use.

The output products of the pipeline are an LSAN structure that can be read into ISAP and an ASCII version of the data that can be read into IDL using a routine that is provided. During the processing the pipeline also generates a number of flags that provide information and warnings about the spectra.

The following sections contain details of the features and problems with LWS data, some important comments about the input data, a brief description of the pipeline process, details of the flags and a description of the output products.

LWS full grating mode (L01) observations cover the wavelength range 43 - 197 microns using five short-wavelength (SW1 - SW5) and five long-wavelength (LW1 - LW5) photoconductive detectors. The wavelength ranges and detector types are given in Table 1. The ranges used for the pipeline are identical to the nominal wavelength ranges given in the Handbook T2.3. The long wavelength range is observed in the first order of the grating while the short wavelength range is observed in the second order, leading to a factor of ~2 difference in spectral resolution.

**Table 1:** *LWS detectors.*
Detector	Range (microns)	Type
SW1	43-50.5	Ge:Be
SW2	49.5-64	Ge:Ga(u)
SW3	57-70	Ge:Ga(u)
SW4	67-82	Ge:Ga(u)
SW5	76-93	Ge:Ga(u)
LW1	84-110	Ge:Ga(u)
LW2	103-128	Ge:Ga(s)
LW3	123-152	Ge:Ga(s)
LW4	142-171	Ge:Ga(s)
LW5	161-197	Ge:Ga(s)
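The nominal ranges in Table 1 can be held in a small lookup structure. The Python sketch below is illustrative only and is not part of the pipeline; the names `LWS_DETECTORS`, `detectors_covering` and `nominal_overlaps` are hypothetical. It shows which detectors nominally cover a given wavelength and where adjacent detectors overlap.

```python
# Nominal L01 detector ranges (microns) and types, copied from Table 1.
LWS_DETECTORS = [
    ("SW1", 43.0, 50.5, "Ge:Be"),
    ("SW2", 49.5, 64.0, "Ge:Ga(u)"),
    ("SW3", 57.0, 70.0, "Ge:Ga(u)"),
    ("SW4", 67.0, 82.0, "Ge:Ga(u)"),
    ("SW5", 76.0, 93.0, "Ge:Ga(u)"),
    ("LW1", 84.0, 110.0, "Ge:Ga(u)"),
    ("LW2", 103.0, 128.0, "Ge:Ga(s)"),
    ("LW3", 123.0, 152.0, "Ge:Ga(s)"),
    ("LW4", 142.0, 171.0, "Ge:Ga(s)"),
    ("LW5", 161.0, 197.0, "Ge:Ga(s)"),
]


def detectors_covering(wavelength):
    """Return the names of detectors whose nominal range includes the wavelength."""
    return [name for name, low, high, _ in LWS_DETECTORS if low <= wavelength <= high]


def nominal_overlaps():
    """Return (detector_a, detector_b, low, high) for each adjacent overlapping pair."""
    overlaps = []
    for (a, _, a_high, _), (b, b_low, _, _) in zip(LWS_DETECTORS, LWS_DETECTORS[1:]):
        if b_low < a_high:
            overlaps.append((a, b, b_low, a_high))
    return overlaps
```

With the nominal ranges every adjacent pair overlaps, giving nine overlap regions from SW1/SW2 to LW4/LW5. Observations that cover only a limited grating range may have smaller overlaps or none.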

The behaviour of the detectors strongly depends on their construction; they are all p-type semiconductors of germanium doped with beryllium or gallium, but the longer wavelength Ge:Ga detectors are put under mechanical stress to increase the wavelength at which they are sensitive. This produces three detector types, Ge:Be (SW1), unstressed Ge:Ga (SW2 - LW1) and stressed Ge:Ga (LW2 - LW5), and broadly three different sets of characteristics. The stressed detectors are mounted on a separate thermal block maintained at 1.8 K while the other detectors are kept at 3 K. A detailed description of the detectors is given in the Handbook S2.6 and of the whole instrument by Clegg et al. (1996).

The basic measurement made by the LWS detectors is a photocurrent, which corresponds to an incident flux. The photocurrent is derived by integrating the voltage produced by the detector over time and fitting the slope, known as the ramp. The process is described in the LWS Handbook S2.6.1.1. The vast majority of observations were made using half-second ramps (integrations), with a small number using quarter-second ramps and one using one-second ramps.
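As a rough illustration of fitting the slope of a ramp, the sketch below performs an ordinary least-squares straight-line fit of voltage against time. It is a simplified assumption of what the slope fit involves, not the Derive-SPD algorithm described in the Handbook; the function name `ramp_slope` and the sample values are hypothetical.

```python
def ramp_slope(times, voltages):
    """Least-squares slope of voltage against time for one integration ramp."""
    n = len(times)
    if n < 2 or n != len(voltages):
        raise ValueError("need at least two matching time/voltage samples")
    mean_t = sum(times) / n
    mean_v = sum(voltages) / n
    # Sum of squared time deviations and of time-voltage cross products.
    s_tt = sum((t - mean_t) ** 2 for t in times)
    s_tv = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, voltages))
    if s_tt == 0:
        raise ValueError("all time samples are identical")
    return s_tv / s_tt


# A noiseless half-second ramp rising at 2 V/s from an arbitrary 1 V offset.
times = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
voltages = [1.0 + 2.0 * t for t in times]
slope = ramp_slope(times, voltages)
```

The fitted slope is proportional to the photocurrent; a glitch within the ramp shows up as a step in the voltage and biases a simple fit of this kind.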

In L01 mode the grating is (usually) scanned forwards and backwards over its full range and the observations accumulate as a number of individual scans. An observation may consist of any number of scans in either the forward or reverse direction. The most common arrangement is a more or less equal number of scans in each direction, but some observations contain scans in one direction only. If an observation contains fewer than 4 scans it is flagged by the pipeline as having a low number of scans. Although the grating is usually scanned over its full range, some observations cover only a limited range, sometimes to such a degree that there is no overlap between adjacent detectors. Normally one measurement is made at each wavelength point during a scan but this can be increased in slow scan observations.

Cosmic ray hits generate voltage glitches on the detectors that lead to spurious photocurrents. In the most extreme cases the sensitivity of the detector may be changed for some time. The rest of the scan may be lost and other scans may be affected. In less extreme cases the affected ramp and some following ramps may be lost, and with them the corresponding calibrated fluxes. Affected ramps are usually rejected but many ramps in the glitch tail and also ramps with smaller glitches do find their way into the flux calibrated spectrum. These are the major component of random noise in LWS spectra. See the LWS Handbook S2.6.2 and S4.3.5.

Observations can be made at a number of wavelength resolutions corresponding to 1, 2, 4 and 8 (called the oversampling rate) times the spectral resolution, which is about 0.29 and 0.6 microns in the short and long-wavelength detectors respectively. The standard value is 4 and strong emission line sources tend to be done at 8. When an oversampling rate of less than 4 has been used, these observations are flagged by the pipeline as having low sampling.
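As a worked example of the arithmetic, the sketch below derives the approximate wavelength step per sample by dividing the quoted resolution element by the oversampling rate, and applies the low-sampling criterion. The names `RESOLUTION`, `sampling_interval` and `low_sampling` are hypothetical, and the assumption that the step is simply resolution divided by rate is a simplification.

```python
# Approximate spectral resolution element in microns, as quoted in the text.
RESOLUTION = {"SW": 0.29, "LW": 0.6}


def sampling_interval(detector, oversampling_rate):
    """Approximate wavelength step (microns) between samples for a detector."""
    if oversampling_rate not in (1, 2, 4, 8):
        raise ValueError("oversampling rate must be 1, 2, 4 or 8")
    # The first two characters of the detector name select SW or LW.
    return RESOLUTION[detector[:2]] / oversampling_rate


def low_sampling(oversampling_rate):
    """True when the observation would be flagged as having low sampling."""
    return oversampling_rate < 4
```

At the standard rate of 4 this gives a step of about 0.07 microns in the short-wavelength detectors and 0.15 microns in the long-wavelength detectors.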

Under normal circumstances only one observation is made at each wavelength point during a scan but in slow scan observations this can be increased. In all the slow scan observations four measurements were made at each wavelength point per scan. These observations usually contain few scans but the pipeline treats each set of measurements as a separate scan. The pipeline also flags slow scan observations. Because slow scan observations are stepped more slowly through the spectrum the transient effects are much reduced.

Observations can be made in one of three tracking modes (called attitude types): pointed, raster and tracking. Pointed observations are made of a single fixed target. Raster observations are made of multiple fixed targets and are made either in a line or a rectangle. The raster position is given in terms of points and lines, e.g. 2 3 or 1 32. Tracking observations are made as multiple pointings of a moving target, usually with a small number of scans at each position.

Each raster observation is processed separately and treated as a pointed observation by the pipeline, and generates a mean spectrum for each raster position. All raster positions are held together in the same output file. Tracking observations are assumed to be of the same object and are treated as separate (groups of) scans of a pointed observation, and generate a single output spectrum.

Fringing is a spurious modulation of the spectrum due to interference between the incoming beam and its reflection from a mirror substrate. The problem is described in the Handbook, S2.3, S5.9, S6.2. An interactive routine is available within ISAP and LIA to empirically remove the fringes. The problem affects extended sources and point sources observed off axis, and although it is worst at longer wavelengths in some cases all detectors can be affected.

The pipeline makes a judgement about the significance of any fringing and applies the de-fringing routine if necessary. Observations with the most severe fringing are flagged.

Transient effects refer to problems caused by the variable time response of the detectors to changes in flux. In general all detectors respond very quickly to a reduction in flux but the upward response time depends on the construction of the detector, the initial flux level and the rate of change in flux. The problem is described in the Handbook S6.9. Transient effects show themselves in two ways, both with the same cause. The first is in the overall shape of the continuum which tends to be different in the forward and reverse scan directions. In this case the detector is following the shape of the filter profile and the spectrum tends to lean in the direction of the scan. In the second, strong emission lines also tend to lean in the direction of the scan as the detector struggles to respond to the rapid increase in flux.
The pipeline uses the difference between the forward and reverse scans to make a judgement about the extent of the transient problem, and flags the worst cases.

The response of the detectors is non-linear when presented with very strong sources, but instead of simply underestimating the flux, the detectors behave as though the bias voltage has been changed, which also affects the spectral response. For details see the Handbook S5.7. Part of the problem may be overcome by reprocessing the data as quarter-second ramps (see Handbook S5.8), and a number of strong sources were originally observed in this way. A correction is also under development (see Handbook S5.7.2). The pipeline uses the OLP data directly and makes no attempt to correct for saturation effects.

There is a variable degree of wavelength overlap between adjacent detectors and no additional effort is made in either the OLP or the L01 pipeline to force agreement. The detectors are calibrated independently and under normal circumstances the agreement in the overlap regions is consistent with the errors in the calibration and the S/N ratio of the data. However, if the source is extended or is observed off axis, particularly in one part of the field, then the spectrum will be fringed, which increases the disagreement, or in the worst cases the spectrum may be fractured (see Handbook S6.3) and the normal calibration breaks down. There are also other reasons why the agreement may be poor, i) warm-up features, ii) transient effects, iii) de-biasing effects, iv) poor background subtraction in weak sources, v) other spurious features.

A significant number of LSAN files (275) suffer from a scan numbering problem, which is due to a software problem in OLP v10.1. In these files there is usually a substantial difference in flux between the correctly and incorrectly numbered scans, which leads to severely corrupted spectra. To see how these spectra are handled by the pipeline see 'The data'.

A small number of spectra suffer from contamination due to a NIR leak, which produces a broad emission feature in SW1-SW3 and LW1-LW3. The full feature is seen in SW2 and LW2 but the wings are visible at the ends of the adjacent detectors. To see how these spectra are handled by the pipeline see 'The data'.

Spurious emission features appear in the LWS spectra from the longest wavelength detectors (LW3, LW4 and LW5) in observations taken towards the end of the mission. It is likely that these are due to a warming of the stressed detector mount. The problem is described in the Handbook S6.8. No correction is attempted by the pipeline. For more details see 'The data'.

The standard OLP 10.1 L01 SW1 spectra often show a double-peaked structure that is believed to be due to spurious features in the Relative Spectral Response Function (RSRF) for that detector. To see how these spectra are handled by the pipeline see 'The data'.

The SW1 double-peaked feature is the most obvious of the spurious features introduced by the RSRF. Another is a sharp apparent absorption at 77 microns in SW5 but other smaller features are suspected.

The LWS data are processed automatically through an off-line processing pipeline (OLP last version v10.1) consisting of three stages: Derive-ERD extracts the observation-relevant data from the telemetry stream; Derive-SPD processes the raw detector readouts into photocurrent and removes glitches due to particle impacts; Auto-Analysis performs the astronomical calibration of the data to produce a spectrum in flux units versus wavelength units. These different stages allow limited reprocessing of partially reduced data sets where some parameters may be changed.

The end products are the Edited Raw Data (ERD), Standard Processed Data (SPD) and Auto-Analysis Results (AAR) files. The principal products of the AAR process are the 'LSAN' files, which contain the wavelengths and absolutely calibrated fluxes of each scan together with processing flags and errors, and are the usual starting point for any analysis of the spectrum. These are the files held in the LWS archive and are the data used as input to the pipeline, with two notable exceptions.

1. Scan numbering problem. A significant number of files suffer from a scan numbering problem, which is due to a software problem in OLP v10.1. All the affected spectra, numbering 275, have now been corrected by having the scan numbers correctly written in the SPD file. These were then run on the auto-analysis pipeline to produce corrected LSAN files that were used as input to the pipeline. A list of the affected files is given in Table 2. Note that the corrupted spectra are likely to remain permanently in the official archives, and the corrected versions may only be available as another HPDP data set.

2. NIR leak. A small number of spectra suffer from contamination due to a NIR leak, which produces a broad emission feature in SW1-SW3 and LW1-LW3. The pipeline has been used to identify further spectra suffering from this problem, bringing the total number known to 44. The process for removing this feature is described in the Handbook S6.7 and this has been used prior to the data being processed by the pipeline. A list of the affected files is given in Table 3.

1. Warm-up features. The pipeline has been used to identify further spectra that contain spurious features due to the detector warm-up, bringing the total number known to 107. The problem is described in the Handbook S6.8 where an initial list of affected TDTs is given. There is no correction that can be applied to these spectra, and nothing has been attempted by the pipeline, so they must be used with caution. A list of the affected files is given in Table 4.

2. SW1 double-peaked feature. The standard OLP 10.1 L01 SW1 spectra often show a double-peaked structure that is believed to be due to spurious features in the RSRF for that detector. A correction has been derived for the SW1 RSRF which has been applied to all the final, mean SW1 spectra. Further details are given in 'The SW1 double-peaked correction'.

The aim of the pipeline is to reject bad data and produce a mean spectrum. The main problems to overcome in doing this are the systematic differences between the scan directions and glitches in the data. Detector transient effects tend to produce systematic differences between the scan directions, and long-term effects can also lead to systematic differences between scans in the same direction. While the overall shape of individual scans may vary, the short-term structure is much more consistent. The largest contributions to random noise are the tails of large glitches that have not been fully removed, and also much smaller glitches that have gone unrecognised. The pipeline uses both the consistency along the scan and across the scans, at the same wavelength, to reject spurious, mostly glitched, data. The consistency of the scans in the same direction is used to provide a reliable mean, and the residuals of individual scans are examined for discordant points. The systematic difference of each scan from the mean of that scan direction is then subtracted and the points at each wavelength are sigma clipped to reject any remaining discordant data. The pipeline processes the forward and reverse scans separately, merging them only at the end.

1. Wavelength assignment. Each observed wavelength is correctly assigned to the appropriate wavelength bin.
2. Drift correction. Any remaining drift of flux with time is corrected separately for both scan directions.
3. First pass. Discordant residuals from each scan, relative to the mean spectrum of all scans, are rejected.
4. Second pass. The systematic difference of each scan from the median spectrum for each scan direction is removed and any discordant residuals are rejected.
5. Median spectrum. The median spectrum for each scan direction is calculated and any discordant residuals are rejected. Using the median minimises the effect of any remaining bad data.
6. De-fringing. The median spectrum for each scan direction is automatically de-fringed if necessary. Both scan directions must be above the threshold for them to be de-fringed.
7. Final spectrum. The mean spectra from the forward and reverse scans are merged to give a final mean spectrum. The final spectrum is the average of the mean forward and reverse scans where any data missing from one scan only has been interpolated over. At this stage a correction is applied for the double peaked structure in detector SW1 that is believed to be due to spurious features in the RSRF for that detector.
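Steps 4, 5 and 7 above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline code: all scans are assumed to lie on a common wavelength grid with no missing points, the clipping threshold of 3 is a guess, and a clip based on the median absolute deviation (MAD) is swapped in for plain sigma clipping because it behaves better with only a handful of scans. All function names are hypothetical. Drift correction, de-fringing, interpolation over missing data and the SW1 correction are omitted.

```python
from statistics import median


def robust_clip(values, n_sigma=3.0):
    """Keep values within n_sigma robust deviations of the median (MAD-based)."""
    centre = median(values)
    # 1.4826 scales the MAD to the standard deviation of a normal distribution.
    spread = 1.4826 * median([abs(v - centre) for v in values])
    return [v for v in values if abs(v - centre) <= n_sigma * spread]


def combine_direction(scans, n_sigma=3.0):
    """Combine the scans of one direction into a single median spectrum.

    scans: list of equal-length flux lists, one per scan (assumed common grid).
    """
    n_bins = len(scans[0])
    # Reference spectrum: median over scans at each wavelength bin.
    reference = [median([s[i] for s in scans]) for i in range(n_bins)]
    # Remove the systematic offset of each scan from the reference.
    levelled = []
    for s in scans:
        offset = median([s[i] - reference[i] for i in range(n_bins)])
        levelled.append([flux - offset for flux in s])
    # Clip discordant points at each wavelength, then take the median.
    return [median(robust_clip([s[i] for s in levelled], n_sigma))
            for i in range(n_bins)]


def merge_directions(forward, reverse):
    """Average the forward and reverse spectra bin by bin."""
    return [(f + r) / 2.0 for f, r in zip(forward, reverse)]


# Five toy scans: two carry a constant offset and one has a glitched point.
scans = [
    [1.0, 2.0, 3.0, 4.0],
    [1.5, 2.5, 3.5, 4.5],
    [0.5, 1.5, 2.5, 3.5],
    [1.0, 2.0, 50.0, 4.0],
    [1.0, 2.0, 3.0, 4.0],
]
forward = combine_direction(scans)
final = merge_directions(forward, [2.0, 3.0, 4.0, 5.0])
```

Treating the two directions separately until the last step keeps the direction-dependent transient shape out of the rejection statistics, which is the design choice described above.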

During the processing the pipeline examines various features of the data and produces a number of flags. These range from simple information flags such as whether the observations were made in slow scan mode to complex measures of the scale of the transient effect.

All the flags generated by the pipeline begin with 'L01' and appear at the end of the primary header. The first three flags appear just once while the others are given for each raster. The first keyword 'L01PROC' identifies the version of the pipeline used for processing, e.g.

which identify raster, tracking and pointed observations respectively. For raster observations the final raster number is given in the comment field. Tracking observations are all processed as scans of pointed observations and this is made clear in the comment field.

The following keywords appear for each raster. The 'L01RAST' keyword identifies the particular raster position to which the following keywords refer, e.g.

Slow scan observations are identified by the 'L01SLOW' keyword and the flag is set to either 'T' or 'F' depending on whether the observation is in slow scan mode or not. If the observation is in slow scan mode then the comment field gives the number of samples at each wavelength position, e.g.

Observations with low sampling rates are identified by the 'L01SAMP' keyword. If the sampling rate is < 4 then the flag is 'T' otherwise it is 'F'. The comment field reinforces the flag and gives the value of the sampling rate, e.g.

The 'L01SCANS' keyword identifies observations with a low number of scans. If there are < 4 scans then the flag is 'T' otherwise it is 'F'. The comment field gives the effective number of scans in the forward and reverse directions. For this purpose, i) fragments of scans are not counted, ii) in slow scan mode each repeated set of observations is treated as an additional scan, and iii) in tracking observations each pointing contributes to the scan count, e.g.

Badly glitched observations are identified by the 'L01GLITC' keyword. The pipeline examines the fraction of data with the glitch flag set and if this exceeds 40% then the flag is set 'T' otherwise it is 'F'. The comment field indicates how many detectors are affected, e.g.
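The 40% criterion can be illustrated with the sketch below. It assumes that the glitched fraction is evaluated per detector and that the flag is 'T' when at least one detector exceeds the threshold; the text does not state this explicitly, so both points are assumptions, and the name `glitch_summary` is hypothetical.

```python
def glitch_summary(glitch_flags_by_detector, threshold=0.40):
    """Return (flag, number of affected detectors) from per-point glitch flags.

    glitch_flags_by_detector: dict mapping detector name to a list of booleans,
    True where a data point has the glitch flag set.
    """
    affected = []
    for name, flags in glitch_flags_by_detector.items():
        if not flags:
            continue  # no data for this detector
        fraction = sum(1 for f in flags if f) / len(flags)
        # Assumption: a detector counts as affected above the 40% threshold.
        if fraction > threshold:
            affected.append(name)
    return ("T" if affected else "F"), len(affected)
```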

Badly fringed observations are identified by the 'L01FRING' keyword. The pipeline examines the level of fringing in each detector and if this is thought to be significant then each detector is de-fringed independently of the others. If the fringing is severe in any detector then, in addition, the flag is set to 'T'. The flag is set to 'F' only if the fringing is not severe, but some detectors may still be de-fringed. The comment field contains a set of 10 T/F characters that identify which detectors have been de-fringed, in order from SW1-LW5, or indicates that no detectors have been de-fringed, e.g.

Observations badly affected by transient effects are identified by the 'L01TRANS' keyword, but the pipeline makes no attempt to correct for them. The flag is set 'T' if this is thought to be significant in any detector, otherwise it is 'F'. The comment field contains a set of 10 T/F characters that identify which detectors have large transient effects, in order from SW1-LW5, e.g.

The 'L01OVER' keyword identifies observations where there are extreme differences in the fluxes in the overlap regions between adjacent detectors. The flag is set 'T' if there are significant differences in at least two cases, excluding SW1, otherwise it is 'F'. The comment field contains a set of 9 T/F characters that identify which detector pairs have significant differences between them, in order from SW1/SW2-LW4/LW5, e.g.
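The per-detector T/F strings and the 'L01OVER' rule can be sketched as follows. This is an illustration, not the pipeline code: the function names are hypothetical, and "excluding SW1" is interpreted here as ignoring the SW1/SW2 pair when counting significant differences, which is an assumption.

```python
DETECTORS = ["SW1", "SW2", "SW3", "SW4", "SW5", "LW1", "LW2", "LW3", "LW4", "LW5"]


def detector_flag_string(affected):
    """Build the 10-character T/F string, in order SW1-LW5, from a set of names."""
    return "".join("T" if name in affected else "F" for name in DETECTORS)


def overlap_flag_string(pair_flags):
    """Build the 9-character T/F string for pairs SW1/SW2 to LW4/LW5."""
    if len(pair_flags) != len(DETECTORS) - 1:
        raise ValueError("expected one flag per adjacent detector pair (9)")
    return "".join("T" if flag else "F" for flag in pair_flags)


def overlap_flag(pair_flags):
    """Return 'T' if at least two pairs differ significantly, ignoring SW1/SW2."""
    if len(pair_flags) != len(DETECTORS) - 1:
        raise ValueError("expected one flag per adjacent detector pair (9)")
    # Assumption: 'excluding SW1' means the first pair (SW1/SW2) is not counted.
    n_significant = sum(1 for flag in pair_flags[1:] if flag)
    return "T" if n_significant >= 2 else "F"
```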

In addition to the flags set by the pipeline there are two other flags that may appear in the header of the pipeline product. These relate to the scan numbering problem and the NIR leak. In both cases these problems have been corrected in the input LSAN files prior to processing by the pipeline. If either problem has been fixed then either the 'LSANNEW' or 'LNIRLREM' keyword will appear in the header, with the flag set to 'T', e.g.

The pipeline generates two output products. The first is a full LSAN structure that may be read into ISAP and the second is an ASCII version of the data with a reduced number of fields. Both contain the full primary and secondary headers, which contain the flag information.

There are three output spectra, the mean of the forward and reverse scan directions, and the final merged spectrum. These are given so the user can clearly identify any transient effects. Also, the correction for the SW1 double-peaked feature is only applied to the final spectrum so the user can examine the uncorrected spectra if necessary.

The full LSAN structure is given but some of the fields are not used, and set to zero. These are the time fields and the wavelength uncertainty. The status word contains the full status information of the individual fluxes that went into the average value in an 'OR' combination. The scan number and scan count have the values 0, 1 and 2 for the forward, reverse and combined spectrum, respectively.
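The 'OR' combination of status words can be illustrated with a short Python sketch. The bit values used in the example are hypothetical and do not correspond to the actual LSANSTAT bit assignments; the function name `combined_status` is also an assumption.

```python
from functools import reduce
from operator import or_


def combined_status(status_words):
    """Bitwise OR of the status words of all fluxes contributing to one average."""
    return reduce(or_, status_words, 0)


# Three contributing points with hypothetical status bits set.
status = combined_status([0b0001, 0b0100, 0b0001])
```

A bit set in the combined word therefore means that at least one of the contributing fluxes carried that status, not that all of them did.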

**Table 6:** *LWS LSAN file record structure.*
Field	Unit	Description	Not/used
LSANUTK	zero	UTK time	Not used
LSANRPID	integer, integer	Raster Point ID	Used
LSANFILL	0	Filler	Not used
LSANLINE	1	Line number	Not used
LSANDET	integer	Detector ID	Used
LSANSDIR	integer	Scan direction	Used
LSANSCNT	integer	Scan count	Used
LSANWAV	microns	Wavelength	Used
LSANWAVU	zero	Uncertainty in wavelength	Not used
LSANFLX	W/cm^2/micron	Flux on detector	Used
LSANFLXU	W/cm^2/micron	Flux uncertainty	Used
LSANSTAT	bits	Status word	Used
LSANITK	zero	ITK time	Not used

The ASCII file contains only six fields but these should be sufficient to analyse the spectra. A routine is provided to read this structure into IDL.

The LWS L01 Pipeline

Version 1, December 2003

Acknowledgements

Documentation

Introduction

Properties of L01 data

The data

The process

Flags

Output products