PICODIV, a European FP5 program
Monitoring the diversity of photosynthetic picoplankton in marine waters

Databases

Quick Access:
Note: To download files, click with the right mouse button (not the left) and select "Download link" from the pop-up menu.
 
Databases available for download (format in parentheses):

  • rRNA sequences (ARB database)
    • ARB release June 2002: SSURNA_PICODIV.arb
    • PICODIV updates 1 to 9 (all sequences, to use with the above file): SSURNA_PICODIV.a09
    • Both zipped: SSURNA_PICODIV.zip
    • PICODIV update 9 (only new sequences): SSURNA_PICODIV09.arb
    • PICODIV update 8 (only new sequences): SSURNA_PICODIV08.arb
    • PICODIV update 7 (only new sequences): SSURNA_PICODIV07.arb
  • rRNA sequences, inventory (MS Access): RNA_algae.zip
  • rRNA probes (Web): Go to database
  • Species taxonomy (MS Access): Phytoplankton_species.zip
  • Cultures, RCC catalog (MS Access): RCC_catalog.zip
  • Coastal sites (MS Access): Coastal_sites.zip

New information added:

Conditions of use: All data remain the property of the group or individual that generated them. Therefore they cannot be used without their consent. All partners of the project have agreed that anyone who uses someone else's data should propose co-authorship to that person in resulting papers.

Distribution: All data from these databases are strictly for use within the PICODIV genome project. It is therefore forbidden to distribute them to third parties.

Location: All databases are physically located on the Roscoff site.

Format: All major primary databases are developed under Microsoft Access 97. This software has the advantage of being widely available and of being fairly simple to program using Visual Basic. Sequence databases are managed with the ARB software.

Update procedure: As soon as they become available, new data (sequences, probes, etc.) are sent to Roscoff by email using special templates available through the Web (see below). These data are included in the corresponding master database, which is also accessible through the Web.

 Download presentation about databases (June 2000).

Probe database

Primary database: Access format

Web database: Flat text file with PHP interface (RNA_probes.php3)

To submit new probes:

  1. Download the template file: RNA_probes_template.xls
  2. Fill in the file without adding or removing any column
  3. Email the file to Daniel Vaulot

Taxonomy database

How to use the database

  1. Download the file Phytoplankton_species.zip
  2. Unzip the files, keeping the directory structure as it is (select the correct WinZip option). The main Access file (Phytoplankton species.mdb) should be in a directory called \Cultures, while all pictures are in a subdirectory called \Cultures\Phytoplankton images. If the directory structure is modified, you will not be able to see any of the pictures.
  3. Open Microsoft Access
  4. Go to the menu File/Open Database
  5. Select \Cultures\Phytoplankton species.mdb
  6. You cannot open the database by double-clicking it from the file manager, because the pictures will not show up
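
Since the pictures only show up when the directory structure is intact, it can help to check the unzipped files before opening Access. The following is a minimal sketch in Python, not part of the PICODIV distribution; the function name check_layout is hypothetical, and the file and directory names are taken from step 2 above.

```python
from pathlib import Path

# Names taken from the instructions above.
DB_NAME = "Phytoplankton species.mdb"
IMAGE_DIR = "Phytoplankton images"


def check_layout(cultures_dir):
    """Return a list of problems; an empty list means the layout looks right."""
    root = Path(cultures_dir)
    problems = []
    # The main directory is expected to be called "Cultures".
    if root.name != "Cultures":
        problems.append(f"directory is called {root.name!r}, expected 'Cultures'")
    # The Access file must sit directly in that directory.
    if not (root / DB_NAME).is_file():
        problems.append(f"missing database file: {DB_NAME}")
    # The pictures must be in a subdirectory of that directory.
    if not (root / IMAGE_DIR).is_dir():
        problems.append(f"missing picture directory: {IMAGE_DIR}")
    return problems
```

Calling check_layout on the unzipped \Cultures directory and printing the returned list shows what, if anything, needs to be moved back into place.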

How to add and edit data

Warning: In Microsoft Access, any change made to a field is immediately taken into account and saved. There is no way to undo changes once made. The only safeguard is to always keep a copy of the database under a different name that can be reloaded in case of problems.
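
One way to keep such a copy is to duplicate the .mdb file under a dated name before each editing session. The sketch below is only an illustration in Python; the function name backup_database and the naming scheme of the copy are assumptions, not a PICODIV convention. Run it while the database is closed in Access.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_database(db_path, backup_dir=None, now=None):
    """Copy db_path to a timestamped file and return the path of the copy."""
    source = Path(db_path)
    if not source.is_file():
        raise FileNotFoundError(f"database not found: {source}")
    # By default the copy is stored next to the original file.
    target_dir = Path(backup_dir) if backup_dir else source.parent
    target_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: <name>_backup_<YYYYMMDD_HHMMSS>.mdb
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    target = target_dir / f"{source.stem}_backup_{stamp}{source.suffix}"
    shutil.copy2(source, target)  # copy2 also preserves the file timestamps
    return target
```

To go back to an earlier state, close Access and copy the chosen backup over the working file.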

Pictures: Scan pictures at 75 or 100 dpi resolution. Using Photoshop or a similar program, create a compound image grouping the key features of the species of interest (drawing, light microscope, whole mount, section). Keep the compound image small enough that it can fit on a screen. Add a scale to each part of the figure and label the scale directly on the picture. Save as a JPEG file at x10 compression. The file should be named Genus_Species.jpg and should reside in the directory \Cultures\Phytoplankton images.
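
The naming rule for pictures can be written down as a small helper. This is a sketch under the assumption that the genus and species names are used exactly as typed; the function names and the example species are for illustration only.

```python
from pathlib import PureWindowsPath

# Picture directory named in the instructions above.
IMAGE_DIR = PureWindowsPath(r"\Cultures\Phytoplankton images")


def picture_filename(genus, species):
    """Build the file name Genus_Species.jpg from the two names."""
    genus, species = genus.strip(), species.strip()
    for part in (genus, species):
        # Reject empty names and names containing spaces.
        if not part or any(ch.isspace() for ch in part):
            raise ValueError(f"invalid name part: {part!r}")
    return f"{genus}_{species}.jpg"


def picture_path(genus, species):
    """Full Windows path where the picture is expected to reside."""
    return IMAGE_DIR / picture_filename(genus, species)


print(picture_path("Micromonas", "pusilla"))
```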

Fields: Species number is an automatic field that is incremented every time a new species is added. Date entered is automatically filled with the current date for any new species added to the database. Some fields, such as Length min, max, etc., can only accept numeric values.

Pigment data: To add pigment data, move to the subform at the bottom of the species form. First enter the species number for which you have pigment data, then check the different pigments present.

Remarks

The taxonomy database is not restricted to picoplanktonic species. In particular, species from some of the major picoplankton classes (Prymnesiophytes, Prasinophytes, Pelagophytes) have been added, and we will try to be as complete as possible.

Culture database

How to use the database

  1. Download the file RCC_catalog.zip
  2. Unzip the files.
  3. Double-click on the file Cultures de Roscoff.mdb in the file manager

Coastal sites (only Roscoff site at present)

Content

  • Roscoff site
    • Samples taken
    • Hydrology (temp, sal, nutrients, chlorophyll, diatoms)
    • Pigments
    • Enrichment cultures

How to use the database

  1. Download the file Coastal_sites.zip
  2. Unzip the files.
  3. Roscoff site: Double-click on the file English Channel.mdb in the file manager

Sequence databases

Update history

As a general rule, only full length sequences from EMBL have been added to the database.  In special cases, e.g. for plastid SSU sequences or some environmental sequences, partial sequences have been added.

For each release, the list below gives the date released, the sequences added, and the number of sequences added. Where the category is known, numbers are given for Eukaryote sequences (nucleus, mito, plastid, nucleomorph) and Prokaryote sequences (Synechococcus/Prochlorococcus, Other Bacteria).

Release .09
Date released: 7/03/04
Sequences added:
EMBL sequences
  • Environmental Bay of Fundy
  • Environmental Station Jericho
  • Haptophyta
  • Rhodophyta
PICODIV
  • PROSOPE cruise DGGE and TTGE (D. Marie)
  • Arabian Sea cultures (plastids, N. Fuller)
Number of sequences: 270, 20

Release .08
Date released: 9/05/03
Sequences added:
EMBL sequences
  • Lobosea
  • Prasinophyceae Zingone
  • Two environmental sets from Stoeck et al.
  • Stramenopiles Massana et al.
PICODIV clone libraries
  • Full sequences from Roscoff, Blanes, Helgoland
  • Cyanobacteria sequences
  • Clones and DGGE from Blanes experiments (Laure)
  • DGGE Vietnam (Laure)
PICODIV Cultures
  • RCC (partial and full)
  • Helgoland (SSCP)
  • 16S Plastid sequences
Number of sequences: 558 (tree), 60 (tree), 45 (tree), 8

Release .07
Date released: 26/07/02
Sequences added:
EMBL sequences
  • Pinguiophyceae
  • Dinophyceae
  • Cryptophyceae
  • Plastid sequence
  • more...
PICODIV Cultures
  • RCC (partial)
  • Blanes (Laure)
Number of sequences: 198, 18, 9

Release .06
Date released: 17/06/02
Sequences added:
EMBL sequences from clone libraries
  • Sogin: deep sea sediments (PNAS)
  • Pace: marine sediments (PNAS)
  • Amaral Rio Tinto (Nature)
PICODIV clone libraries
  • Full sequences from Roscoff
  • Blanes BL010320
PICODIV Cultures
  • RCC (partial and full length)
  • Bremerhaven SSCP (Klaus)
  • Pirsonia from Klaus and Stefanie Kuhn
  • Blanes (Laure)
Number of sequences: 695 (tree)

Release .05
Date released: 20/02/02
Sequences added:
EMBL update: Euk (in particular Rhodophyta), Syn, Proc
Euk Clone libraries
  • Blanes BL000921 (corrected) and BL010625
  • Assignment of Roscoff library sequences corrected
PICODIV Cultures
  • RCC (partial and full length)
  • Blanes (partial)
Number of sequences: nucleus 565, plastid 28, nucleomorph 4 (leuco-plast), Synechococcus/Prochlorococcus 5

Release .04
Date released: 05/09/01
Sequences added:
Euk Clone libraries
  • Roscoff July 01
  • Public data from Lopez-Garcia et al 2001
PICODIV Cultures
  • RCC
Number of sequences: 124

Release .03
Date released: 25/07/01
Sequences added:
PICODIV Euk Clone libraries
  • Roscoff Dec 00, April and May 01
  • Helgoland Dec 00 Feb 01
  • Blanes Sept, Dec
PICODIV Cultures
  • RCC
  • Synechococcus strains
PICODIV 16S Clone libraries
  • PROSOPE DYF and MIO
  • Helgoland August
  • Blanes Sept
Number of sequences: nucleus 405, plastid 19, Synechococcus/Prochlorococcus 35, Other Bacteria 2

Release .02
Date released: 19/03/01
Sequences added:
EMBL update: Euk, Syn, Proc
PICODIV Euk Clone libraries
  • Roscoff June, Sept
  • Helgoland Oct
  • Orkney April
  • Antarctic; North Atlantic and Mediterranean Sea
PICODIV 16S Clone libraries
  • PROSOPE DYF and MIO
  • Helgoland August
Number of sequences: nucleus 378, plastid 12, nucleomorph 5, Synechococcus/Prochlorococcus 87, Other Bacteria 8

Release .01
Date released: 29/12/00
Sequences added:
EMBL update: Euk, Syn, Proc
PICODIV Euk Clone libraries (partial sequences)
  • Roscoff April
  • Helgoland March, April, August
Number of sequences: nucleus 309, plastid 29, nucleomorph 6, Synechococcus/Prochlorococcus 1

Release .arb
Date released: 28/7/00
Sequences added:
EMBL Euk, Syn, Proc
OLIPAC unpublished
Euk Roscoff unpublished
Number of sequences: nucleus 714, mito 4, plastid 26, nucleomorph 7, Synechococcus/Prochlorococcus 52

Critical information (all must read)

Aligned sequences: main database (ARB format)

This database is based upon the ARB database available from the ARB site in Bremen. We are now using the June 2002 version.

The PICODIV ARB file is named SSURNA_PICODIV.arb. It is periodically updated with both public sequences and sequences from the PICODIV project. New sequences are aligned to the closest relative found by the PT-SERVER function and then added to partial parsimony trees using the "Quick add to tree" ARB function. If you are interested, you can find below the detailed procedure for updating the database.

In order to implement the database on your computer, you need to:  

How to add your own fasta sequences to your ARB database
How to update your own database without overwriting sequences that you have re-aligned

How to submit sequences

All sequences must be submitted as FASTA files.

    1. The first line contains the identifier for the sequence, for example the clone library name followed by the clone number
      e.g. >HE000420.015
      corresponds to a partial sequence of Helgoland clone #15 from the library of 20 April 2000
    2. The following lines contain the sequence itself
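
Before sending a file, it can be checked with a few lines of Python. The sketch below reads FASTA text and splits identifiers of the form shown above. The identifier pattern (two-letter site code, six-digit date as YYMMDD, a dot, then the clone number) is an assumption inferred from the single example HE000420.015, and the function names are hypothetical.

```python
import re

# Assumed identifier layout, inferred from the example HE000420.015:
# site code (2 letters), date as YYMMDD, a dot, then the clone number.
ID_PATTERN = re.compile(r"^(?P<site>[A-Z]{2})(?P<date>\d{6})\.(?P<clone>\d+)$")


def read_fasta(text):
    """Return a list of (identifier, sequence) pairs from FASTA text."""
    records = []
    name, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is None:
            raise ValueError("sequence data found before the first '>' line")
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def split_identifier(identifier):
    """Return (site, date, clone number), or None if the pattern does not match."""
    match = ID_PATTERN.match(identifier)
    if match is None:
        return None
    return match.group("site"), match.group("date"), int(match.group("clone"))
```

Identifiers that do not follow the assumed pattern are not necessarily wrong; split_identifier simply returns None for them so that they can be checked by hand.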

  • Other information (experts only)

    ARB database structure

    Trees

    Three subtrees have been extracted from the general tree (tree_tre_1000_may02) and new sequences added to them (this is much faster):

    Tree name (method of addition; number of sequences; type of sequences added):
    • tree_PICODIV_euk (Parsimony; > 5 000; nuclear: mostly lower eukaryotes)
    • tree_PICODIV_cyano_plastid (Parsimony; > 500; cyanobacteria and plastids)
    • tree_PICODIV_bact (Parsimony; > 20 000; mito, bacteria)

    ARB database fields

    Please note that some of the sequences (Accession number ZZ....) are still unpublished and were acquired prior to the PICODIV project.  Therefore, they cannot be used freely.

    PT-Server

    PT-servers are critical elements of ARB: they allow alignment of novel sequences, finding closest relatives, as well as designing and checking novel probes. We are currently using two PT-servers for the SSURNA_PICODIV database. To set up a PT-server, you need to modify (as user root) the end of the text file /usr/arb//lib/arb_tcp.dat.

    • SSU_rRNA_PICODIV: This PT-server is used for all functions except alignment. It is updated with all the available sequences.
    • SSU_rRNA_align: This PT-server is used only for alignment. It has been created from the original ARB database of June 2002. It is NOT updated with the novel environmental sequences, so that the original alignment is not polluted by misaligned novel sequences.

    Sequence inventory (Access format)

    This database contains all the rRNA sequences available for the lower eukaryotes as well as PICODIV sequences as they become available. The Access format allows easy manipulation, selection and formatting of the data.

    How to add new sequences to the database (Administrator)

    Note: The following is only for the person who maintains the database (Daniel for the time being). Others need not worry about this.

    Query of EMBL server to get recently released public sequences

    Update of RNA algae.mdb

    - Load Access database RNA algae.mdb
    - Follow steps outlined in main menu of RNA algae Access database

      1. Import sequence as fasta file (text file in PC format, not UNIX) - Make sure that the Date and PICODIV release number are OK.
      2. Update all necessary fields.  In particular check the following fields for each new sequence:
          • Molecule: SSU, LSU etc...
          • Encoded: nucleus, plastid, mitochondria etc...
          • Partial: check box if the sequence is partial
          • Environmental: check box if the sequence is from a clone library
          • Strain, RCC, Clone number : for culture sequences
          • Cruise, Station, Depth, Clone_library, Clone number : for clone library sequences
          • Unpublished: true
          • Method : clone library, DGGE, SSCP
          • Author: put the name of the person who did or communicated the sequences
          • Sequencing company : Qiagen, GenoMer etc....
      3. Export sequences in EMBL format. All the sequences are exported into a single pseudo-EMBL file. This file can now be imported into ARB (see below).
    • EMBL sequences
      1. Import sequence as EMBL file (text file in PC format, not UNIX) - Make sure that the Date and PICODIV release number are OK.
      2. Update all necessary fields.  In particular check the following fields for each new sequence:
          • Molecule: SSU, LSU etc...
          • Encoded: nucleus, plastid, mitochondria etc...
          • Partial: check box if the sequence is partial
          • Environmental: check box if the sequence is from a clone library
          • Strain : for culture sequences
          • Clone_library, Clone number : for clone library sequences
          • Method : clone library, DGGE, SSCP
          • Taxonomy (Phylum, class etc...): The program tries to fill these data automatically but sometimes this does not work that well
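
Both import steps above expect text files in PC format rather than UNIX format. Assuming this refers to line endings (CRLF on PC, LF on UNIX), a file prepared on a UNIX machine can be converted with a short Python sketch such as the following; the function names are hypothetical.

```python
def to_pc_line_endings(data):
    """Convert bytes with any mix of line endings to CRLF (PC) line endings."""
    # Normalise existing CRLF and lone CR to LF first, so that
    # lines already in PC format are not converted twice.
    unix = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unix.replace(b"\n", b"\r\n")


def convert_file(src, dst):
    """Read src, convert its line endings, and write the result to dst."""
    with open(src, "rb") as handle:
        data = handle.read()
    with open(dst, "wb") as handle:
        handle.write(to_pc_line_endings(data))
```

Working on bytes rather than text avoids any automatic newline translation by Python itself, so the function gives the same result on Windows and UNIX.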

      Update SSURNA_PICODIV.arb


    • Last updated 08 March 2004
