PAZAR Documentation
PAZAR XML format

Step 3: Capturing the evidence linking a sequence to a TF or to a specific expression

This step starts inside an existing 'data' element. At this point, the 'reg_seq', 'funct_tf' and/or 'construct' elements should have been defined in this 'data' element (see Step 2).

3.1. Capturing the experiment information

The "data" element stores all the annotations describing the cell, time, condition, etc.

<cell name="Y79" pazar_id="ce_0001" species="Homo sapiens" status="cell__line"/>
<time name="24-28" pazar_id="ti_0001" scale="stages of embryogenesis"/>
<condition pazar_id="cd_0001" cond_type="coexpression" molecule="transcription factor" concentration="1:1" scale="ratio"/>

Note: replace the attribute values in the examples with your own information. The pazar IDs are internal IDs that will not be stored. They can be anything as long as they are unique throughout the file.

3.2. Capturing the interaction or expression information

The "data" element also stores the description of the interaction and/or expression quality.

<expression pazar_id="ex_0001" quantitative="23" scale="percent"/>
<interaction pazar_id="in_0001" qualitative="good"/>
<interaction pazar_id="in_0002" qualitative="none"/>
<interaction pazar_id="in_0003" quantitative="14" scale="percent"/>

Note: replace the attribute values in the examples with your own information. The pazar IDs are internal IDs that will not be stored. They can be anything as long as they are unique throughout the file.
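
Because every pazar_id has to be unique throughout the file, it can be worth checking for duplicates before submitting. The following is a minimal sketch in Python using only the standard library. The function name find_duplicate_pazar_ids and the sample fragment are illustrative assumptions and are not part of PAZAR.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def find_duplicate_pazar_ids(xml_text):
    """Return the sorted list of pazar_id values that occur more than once."""
    root = ET.fromstring(xml_text)
    # root.iter() walks the root element and all of its descendants.
    counts = Counter(
        elem.get("pazar_id")
        for elem in root.iter()
        if elem.get("pazar_id") is not None
    )
    return sorted(pid for pid, n in counts.items() if n > 1)


# Illustrative fragment only: a real 'data' element holds more content,
# and "in_0001" is deliberately repeated here to trigger the check.
SAMPLE = """
<data>
  <expression pazar_id="ex_0001" quantitative="23" scale="percent"/>
  <interaction pazar_id="in_0001" qualitative="good"/>
  <interaction pazar_id="in_0001" qualitative="none"/>
</data>
"""

if __name__ == "__main__":
    print(find_duplicate_pazar_ids(SAMPLE))
```

An empty list means no pazar_id is repeated. Any ID that is returned should be renamed in the file.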

3.3. Linking it all together

The "data" element can now be closed. Everything stored in it is linked through "analysis" elements, which use the pazar_ids as IDREFS. An "analysis" element describes one experiment, linking sequences and factors (inputs) to an interaction or expression result (output). A "pazar" element can contain as many "analysis" elements as needed.

The cell and time are referenced as attributes of the "analysis" element. The evidence, method and ref are child elements of the "analysis" element. The sequences and factors studied (always use a "funct_tf" element for factors) are referenced in the "inputs" attribute of the "input" element. The interaction or expression descriptions are referenced in the "outputs" attribute of the "output" element.

Thus the example below describes a SELEX experiment in which a TF (pazar_id="fu_0001") binds to 2 different artificial sequences (pazar_ids "co_0001" and "co_0002") with 2 different levels of interaction (pazar_ids "in_0001" and "in_0002"). It therefore needs 2 'input_output' elements: the first describes the interaction of the TF with the first sequence, the second describes its interaction with the second sequence.

Please look at the 3 PAZAR XML examples available on the main page if you need other examples.

  </data>
  <analysis name="analysis_example1">
    <evidence type_evid="curated" status_evid="provisional"/>
    <method method="SELEX"/>
    <ref pmid="7936637"/>
    <input_output>
      <input inputs="fu_0001 co_0001"/>
      <output outputs="in_0001"/>
    </input_output>
    <input_output>
      <input inputs="fu_0001 co_0002"/>
      <output outputs="in_0002"/>
    </input_output>
  </analysis>

Note: replace the attribute values in the examples with your own information. The pazar IDs are internal IDs that will not be stored. They can be anything as long as they are unique throughout the file.
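
Since the "inputs" and "outputs" attributes refer back to pazar_ids defined in the 'data' element, a reference to an undefined ID is an easy mistake to make. The sketch below, in Python with only the standard library, lists such unresolved references. It assumes the file uses no XML namespaces and only inspects the "inputs" and "outputs" attributes. The function name find_unresolved_refs is hypothetical, and the 'funct_tf' and 'construct' elements in the sample are reduced to bare stand-ins (see Step 2 for their real content).

```python
import xml.etree.ElementTree as ET


def find_unresolved_refs(xml_text):
    """Return the sorted IDs used in 'inputs'/'outputs' that no pazar_id defines."""
    root = ET.fromstring(xml_text)
    # Collect every pazar_id defined anywhere in the document.
    defined = {e.get("pazar_id") for e in root.iter() if e.get("pazar_id")}
    referenced = set()
    for tag, attr in (("input", "inputs"), ("output", "outputs")):
        for elem in root.iter(tag):
            # An IDREFS value is a whitespace-separated list of IDs.
            referenced.update(elem.get(attr, "").split())
    return sorted(referenced - defined)


# Illustrative document: "co_0002" and "in_0002" are referenced in the
# analysis but deliberately left undefined in the 'data' element.
SAMPLE = """
<pazar>
  <data>
    <funct_tf pazar_id="fu_0001"/>
    <construct pazar_id="co_0001"/>
    <interaction pazar_id="in_0001" qualitative="good"/>
  </data>
  <analysis name="analysis_example1">
    <input_output>
      <input inputs="fu_0001 co_0001"/>
      <output outputs="in_0001"/>
    </input_output>
    <input_output>
      <input inputs="fu_0001 co_0002"/>
      <output outputs="in_0002"/>
    </input_output>
  </analysis>
</pazar>
"""

if __name__ == "__main__":
    print(find_unresolved_refs(SAMPLE))
```

An empty list means every ID used in an 'input' or 'output' element is defined somewhere in the file.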

3.4. The end

Once all the data has been entered in the 'data' element and linked together through multiple 'analysis' elements, the 'pazar' element can be closed and the XML file is finished.

</pazar>
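
Before using the finished file, a quick well-formedness check can catch unclosed tags or missing angle brackets. The following is a minimal sketch in Python with only the standard library. The function name is_well_formed is hypothetical, and the check covers XML syntax only, not conformance to the PAZAR format itself.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, "") if xml_text parses as XML, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # The parser's message includes the line and column of the problem.
        return False, str(err)
    return True, ""


if __name__ == "__main__":
    ok, message = is_well_formed("<pazar><data/></pazar>")
    print(ok, message)
```

To check a file on disk, read its content first and pass the resulting text to the function.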