Package com.aliasi.medline

Classes for manipulating MEDLINE data.


Interface Summary
MedlineHandler The MedlineHandler interface specifies a single method that applies to a MEDLINE citation.
 

Class Summary
Abstract An Abstract represents the abstract of a MEDLINE document.
Article An Article represents the content of the Article element of a MEDLINE citation.
ArticleDate An ArticleDate represents the date on which the publisher created an electronic version of an article.
Author An instance of Author represents an author of a paper, including information about name, affiliation, title, and various qualifiers and dates.
AuthorList An AuthorList represents the authors of a particular article.
Chemical A Chemical indicates a chemical that is determined by a name and registry number.
CommentOrCorrection A CommentOrCorrection object represents one of the many possible comments or corrections that have been applied to an article.
DataBank A DataBank represents a set of molecular sequences registered in a particular database.
DataBankList A DataBankList contains linkages of molecular sequences mentioned in articles to their data bank name and accession numbers.
ELocationId An ELocationId represents the pagination type information for an electronic publication.
GeneralNote A GeneralNote represents supplemental or descriptive information related to the record.
Grant A Grant represents a particular instance of a grant.
GrantList A GrantList provides information about the grants that funded the work reported in the article.
Investigator An Investigator represents a funded principal investigator for the (United States) National Aeronautics and Space Administration (NASA).
Journal A Journal represents a particular issue of a journal.
JournalInfo A JournalInfo object contains an abbreviation for a journal's title, and optionally a country and optionally a unique NLM identifier.
JournalIssue A JournalIssue contains information about a particular issue of a journal, including publication date and optionally volume and issue number.
KeywordList A KeywordList consists of a set of topics with a specified owner.
MedlineChars The MedlineChars class contains static methods for handling characters in MEDLINE.
MedlineCitation A MedlineCitation represents the content of a single record in the 2008 MEDLINE database for the citation of an individual article.
MedlineCitationSet The MedlineCitationSet provides static constants for the XML elements, attributes and constant values used in MEDLINE.
MedlineParser A MedlineParser is able to parse 2009 MEDLINE citations from an input source.
MeshHeading A MeshHeading represents a particular heading in NLM's controlled vocabulary of Medical Subject Headings (MeSH).
Name A Name is a structured record of a person's first, middle, last name and name suffixes, along with a standardized set of initials.
OtherAbstract An OtherAbstract represents an alternative abstract for an article.
OtherID An OtherID provides an alternative identifier from a specified source.
PersonalNameSubject A PersonalNameSubject is provided for citations that contain a biographical note or obituary about a given individual.
PubDate A PubDate represents a publication date in a semi-structured or unstructured format.
Topic A Topic consists of a string-based topic and an indication as to whether the topic is a major topic for an article.
 

Package com.aliasi.medline Description

Classes for manipulating MEDLINE data. The classes in this package are able to read the MEDLINE database from its gzipped distribution format and render it completely as structured Java objects.

The basic method for handling the complete set of MEDLINE citations with Java is based on the visitor pattern, as described in the class documentation for MedlineCitationSet.
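The handler-based approach described above can be illustrated with a minimal, self-contained sketch. None of the types below are this package's classes: Citation, CitationHandler, visitAll and TermCounter are hypothetical stand-ins invented for the example, and the sample titles and identifiers are made up. The point is only the shape of the pattern: the code that walks the data calls back into a one-method handler once per citation, so the handler never needs the whole collection at once.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class HandlerSketch {

    // Hypothetical stand-in for a parsed citation (not the package's MedlineCitation).
    static final class Citation {
        final String pmid;
        final String title;

        Citation(String pmid, String title) {
            this.pmid = pmid;
            this.title = title;
        }
    }

    // Hypothetical one-method callback interface, in the spirit of a citation handler.
    interface CitationHandler {
        void handle(Citation citation);
    }

    // Hypothetical stand-in for a parser: pushes each citation to the handler in turn.
    static void visitAll(Iterable<Citation> source, CitationHandler handler) {
        for (Citation citation : source) {
            handler.handle(citation);
        }
    }

    // Example handler: counts citations whose title contains a term, ignoring case.
    static final class TermCounter implements CitationHandler {
        private final String term;
        private int count = 0;

        TermCounter(String term) {
            this.term = term.toLowerCase(Locale.ROOT);
        }

        @Override
        public void handle(Citation citation) {
            if (citation.title.toLowerCase(Locale.ROOT).contains(term)) {
                count++;
            }
        }

        int count() {
            return count;
        }
    }

    // Made-up sample data, for illustration only.
    static List<Citation> sampleCitations() {
        return Arrays.asList(
            new Citation("example-1", "Protein kinase signalling"),
            new Citation("example-2", "A note on cell culture"),
            new Citation("example-3", "Kinase inhibitors in vitro"));
    }

    public static void main(String[] args) {
        TermCounter counter = new TermCounter("kinase");
        visitAll(sampleCitations(), counter);
        System.out.println("matching citations: " + counter.count());
    }
}
```

For the actual method names and signatures, consult the class documentation for MedlineHandler and MedlineParser rather than this sketch.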

MEDLINE contains citations to roughly 15 million articles drawn from books and journals on the broad topic of biomedicine, dating from 1966 to the present. It is updated with new citations five times weekly. Roughly 500,000 new citations are added each year, which works out to about 10,000 per week, or about 2,000 per update. MEDLINE was created and is maintained by the Bibliographic Services Division (BSD), a part of Library Operations at the (United States) National Library of Medicine (NLM). The National Library of Medicine is itself a division of the National Institutes of Health (NIH). MEDLINE data is free for just about any purpose, including serving data and as the basis of commercial applications.

Thorough documentation for the content of a MedlineCitationSet document is provided by NIH in the document:

MEDLINE XML Element Descriptions and Their Attributes.

Because the MEDLINE data format changes on a yearly basis, the classes in this package will also change yearly. Rather than trying to version this package by year of release, it will be kept current with the latest version of MEDLINE. This means that there is no guarantee of backward compatibility for these classes as the MEDLINE yearly cycle changes. This document is based on the version of MEDLINE distributed during 2008.

Our own benchmarks indicate that it will take roughly 4 hours to visit each MEDLINE citation on a modern desktop PC running Java in server mode. The memory required for the parsing and visiting itself is negligible: just enough to do the XML parsing and to hold a single citation once it has been constructed.
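To see why memory stays small, consider a streaming parse. The sketch below is not this package's implementation; it uses only the standard java.util.zip and SAX APIs, and the class name StreamingSketch and its methods are hypothetical. It decompresses a gzipped XML stream on the fly and hands each PMID to a callback as soon as its element closes, so only the text of the current element is ever buffered. The element names follow the MEDLINE XML format, but the handling is deliberately simplified: it reports every PMID element it encounters and builds no citation objects.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class StreamingSketch {

    // Streams gzipped XML through a SAX parser, passing each PMID to the callback.
    static void visit(InputStream gzipped, Consumer<String> onPmid) throws Exception {
        DefaultHandler sax = new DefaultHandler() {
            // Non-null only while the parser is inside a PMID element.
            private StringBuilder text;

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if ("PMID".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if ("PMID".equals(qName)) {
                    onPmid.accept(text.toString());
                    text = null; // release the buffer before the next record
                }
            }
        };
        try (InputStream in = new GZIPInputStream(gzipped)) {
            SAXParserFactory.newInstance().newSAXParser().parse(in, sax);
        }
    }

    // Helper for the demonstration: gzip a string in memory.
    static byte[] gzip(String s) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(s.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Tiny made-up document; real distribution files are far larger.
        String xml = "<MedlineCitationSet>"
            + "<MedlineCitation><PMID>1</PMID></MedlineCitation>"
            + "<MedlineCitation><PMID>2</PMID></MedlineCitation>"
            + "</MedlineCitationSet>";
        visit(new ByteArrayInputStream(gzip(xml)), pmid -> System.out.println("visited " + pmid));
    }
}
```

Because the callback runs as each element closes, the working set does not grow with the size of the input file, which is the property the benchmark remark above relies on.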

For general information on MEDLINE, see: