(See these queries and more in a single SQL file)
1. The Arctos SQL interface tool is here
2. Write your SQL in an editor with syntax coloring, then paste it into the box. Good, free coding editors are BBEdit for Mac, and NotePad++ for Windows, but there are thousands of others to choose from.
3. As an Arctos user (i.e., not a developer) we can only use the
SELECT
statement of SQL (Not INSERT
, DELETE
, UPDATE
…). This
greatly reduces what you have to learn. Access via SQL is read only
so you cannot mess anything up!
4. You will find that almost all the data you need to query is in a
single table in Arctos called flat
- this is a composite, flattened
table made from the tens (hundreds) of tables in Arctos. This table also
greatly reduces what you have to learn (at first) - you do not need to
make cross-table JOIN
s in your SQL. To see the fields in flat
enter:
SELECT * FROM flat LIMIT 1
(do not omit the LIMIT 1
!) Or… see the field list here.
5. A basic SELECT
statement goes like this:
SELECT <output fields>
FROM <table>
WHERE <some set of logical filters>
ORDER BY <some output fields to sort by>
LIMIT <how many lines you want>
WHERE
and ORDER BY
and LIMIT are optional, but be careful not
to ask for a query that gives 100 MBs of results by mistake; use
LIMIT
as you develop your SQL.An example:
SELECT guid, scientific_name, family
FROM flat
WHERE (dec_lat > 50 AND
dec_lat < 60 AND
dec_long < -160 AND
dec_long > -170)
AND began_date > '2010-01-01'
AND guid_prefix = 'UAM:Herb'
ORDER BY family, scientific_name
LIMIT 100
6. Click the CSV button if you want to download the results.
7. Sometimes you will find that not all the data you need is in
the flat
table.
There are hundreds of tables in Arctos, linked together via shared
field values and identifiers. The basic goal in a relational database
is to “normalize” the data, which means that no information element is
stored redundantly (e.g., in two or fields or two records). For
example, if there are two collections from the same place, the
longitude and latitude of that place should not be stored twice in the
same collections table. Note: the cached flat
table is
denormalized table with lots of redundancy of this kind.
The link between tables can be visualized in an Entity-Relationship (“ER”) diagram. There is no recent ER diagram for Arctos, but here is one from 2013. You can however see a list of tables here; in the section ‘Constraints’ below the fields you can see which fields (‘OriginatesFrom’) link to fields in other tables (‘ReferencesColumn’).
See the SQL examples for how to JOIN
tables.
8. As you start to write WHERE
filters you will probably soon
find Regular Expressions to be invaluable. These strings of
wildcard symbols allow the matching of just about any substring you
can imagine. In PostgreSL, you can apply them with the ~
operator. E.g. WHERE guid ~ 'UAMb?:(Herb|ALA)'
will match ‘UAM:Herb’,
‘UAMb:Herb’ and ‘UAM:Alg’.
9. Even though this are ‘next level’ SQL, be aware that you can
nest SELECT
statements to query a table you create with an interior
query: SELECT * FROM (SELECT ....) AS a WHERE a.x ... ;
. Powerful
stuff!
10. More resources: