description: Calculates values for various co-occurrence metrics. The expected inputs are triples containing the variable name as subject and the count as object. Marginal counts must appear first, joint counts second. Marginal counts must be written as 1:A, joint counts as 2:A&B.
description: By default, changes the record ID to the value of the ‘_id’ literal (if present). Use the constructor to choose another literal as the ID source.
description: Decode HTML to metadata events. The attrValsAsSubfields option can be used to override the default attribute values to be used as subfields (e.g. by default link rel="canonical" href="http://example.org" becomes link.canonical). It expects an HTTP-style query string specifying as key the attributes whose value should be used as a subfield, and as value the attribute whose value should be the subfield value, e.g. the default contains link.rel=href. To use the HTML element text as the value (instead of another attribute), omit the value of the query-string key-value pair, e.g. title.lang. To add to the defaults, instead of replacing them, start with an &, e.g. &h3.class
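The query-string format of attrValsAsSubfields can be illustrated with a small Python sketch. It is not the module's implementation: the function name `parse_attr_vals` and the `DEFAULTS` mapping are hypothetical, and `DEFAULTS` contains only the one default named above (the real default set may be larger).

```python
from urllib.parse import parse_qsl

# Assumption: only the default mentioned in the description is listed here.
DEFAULTS = {"link.rel": "href"}


def parse_attr_vals(option: str) -> dict:
    """Parse an attrValsAsSubfields-style query string into a mapping.

    Key:   element.attribute whose value becomes the subfield name.
    Value: attribute whose value becomes the subfield value,
           or None if the element text should be used instead.
    """
    # A leading '&' means "add to the defaults" instead of replacing them.
    mapping = dict(DEFAULTS) if option.startswith("&") else {}
    for key, value in parse_qsl(option.lstrip("&"), keep_blank_values=True):
        mapping[key] = value or None
    return mapping
```

Under these assumptions, `parse_attr_vals("&h3.class")` keeps the `link.rel` default and adds `h3.class` with the element text as its value.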
description: Decodes JSON to metadata events. The ‘recordPath’ option can be used to set a JsonPath that extracts a path as JSON, or that splits the data into multiple JSON documents.
description: Parses pica+ records. The parser only parses single records. A string containing multiple records must be split into individual records before passing it to PicaDecoder.
description: Outputs the name and value of each literal which is received as a string. Name and value are separated by a separator string. The default separator string is a tab. If a literal name is empty, only the value will be output without a separator. The module ignores record and entity events. In particular, this means that literal names are not prefixed by the name of the entity which contains them.
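The output rule described above (name, separator, value, with the separator dropped for empty names) can be sketched in Python. The function name `literal_to_string` is hypothetical and only mirrors the behaviour stated in the description.

```python
def literal_to_string(name: str, value: str, separator: str = "\t") -> str:
    """Format one literal as a string; the default separator is a tab."""
    # An empty literal name yields the value alone, without a separator.
    if not name:
        return value
    return f"{name}{separator}{value}"
```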
description: Encodes a stream into MARCXML. If you can’t ensure valid MARC21 (e.g. the leader isn’t correct or not set as one literal) then set the parameter ensureCorrectMarc21Xml to true.
description: A generic XML reader. Separates XML data into distinct records with the defined record tag name (default: recordtagname="record"). If no matching record tag is found, the output will be empty. The handler breaks down XML elements with simple string values and optional attributes into entities with a value subfield (name configurable) and additional subfields for each attribute. Record tag and value tag names can be configured. Attributes can get an attributeMarker.
description: A MARC XML reader. To read MARC data without a namespace specification, set option namespace="". To ignore the namespace specification, set option ignorenamespace="true".
description: Lists all paths found in the input records. These paths can be used in a Fix to address fields. Options: count (output occurrence frequency of each path, sorted by highest frequency first; default: true), template (for formatting the internal triple structure; default: ${o} | ${s} if count is true, else ${s}), index (output individual repeated subfields and array elements with index numbers instead of ‘*’; default: false)
options: template (String), count (boolean), index (boolean)
description: Lists all values found for the given path. The paths can be found using fix-list-paths. Options: count (output occurrence frequency of each value, sorted by highest frequency first; default: true), template (for formatting the internal triple structure; default: ${o} | ${s} if count is true, else ${s})
description: Opens an OAI-PMH stream and passes a reader to the receiver. Mandatory arguments are: BASE_URL, DATE_FROM, DATE_UNTIL, METADATA_PREFIX, SET_SPEC.
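To show how the five mandatory arguments map onto an OAI-PMH request, here is a Python sketch that builds a request URL. The parameter names `verb`, `from`, `until`, `metadataPrefix` and `set` come from the OAI-PMH protocol. The function name is hypothetical, and the use of the `ListRecords` verb is an assumption, since the description does not say which request the module issues.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, date_from, date_until,
                           metadata_prefix, set_spec):
    """Build an OAI-PMH request URL from the five mandatory arguments."""
    params = {
        "verb": "ListRecords",  # assumption: a selective harvest request
        "from": date_from,
        "until": date_until,
        "metadataPrefix": metadata_prefix,
        "set": set_spec,
    }
    return f"{base_url}?{urlencode(params)}"
```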
description: Uses the object value of the triple as a URL and emits a new triple in which the object value is replaced with the contents of the resource identified by the URL.
description: Sorts triples. Several options can be combined, e.g. by="object",numeric="true",order="decreasing" will numerically sort the Object of the triples in decreasing order (given that all Objects are indeed of numeric type).
options: by [SUBJECT, PREDICATE, OBJECT, ALL], numeric (boolean), order [INCREASING, DECREASING]
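How the three sort options combine can be sketched in Python. `Triple` and `sort_triples` are hypothetical names, and the defaults shown (`by="subject"`, `order="increasing"`) are assumptions that the description does not confirm.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str


def sort_triples(triples, by="subject", numeric=False, order="increasing"):
    """Sort triples by one component, or by all three when by='all'."""
    def key(triple):
        if by == "all":
            return tuple(triple)
        value = getattr(triple, by)
        # Numeric sorting only works if every value parses as a number.
        return float(value) if numeric else value

    return sorted(triples, key=key, reverse=(order == "decreasing"))
```

Without `numeric=True`, the object values "9", "10" and "100" are compared as strings, so "9" sorts after "100". This is why the numeric option matters for count-like values.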
description: Emits the literals which are received as triples such that the name and value become the predicate and the object of the triple. The record id containing the literal becomes the subject. If ‘redirect’ is true, the value of the subject is determined by using either the value of a literal named ‘_id’, or for individual literals by prefixing their name with ‘{to:ID}’. Set ‘recordPredicate’ to encode a complete record in one triple. The value of ‘recordPredicate’ is used as the predicate of the triple. If ‘recordPredicate’ is set, no {to:ID}NAME-style redirects are possible.
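The mapping from literals to triples, including both redirect styles, can be sketched in Python. This is not the module's code. The function name is hypothetical, the ‘recordPredicate’ mode is left out, and dropping the ‘_id’ literal from the output when ‘redirect’ is true is an assumption.

```python
import re

# Matches literal names of the form "{to:ID}NAME".
REDIRECT = re.compile(r"^\{to:(.+?)\}(.+)$")


def literals_to_triples(record_id, literals, redirect=False):
    """Turn (name, value) literals of one record into (s, p, o) triples."""
    subject = record_id
    if redirect:
        # A literal named '_id' replaces the record id as the subject.
        for name, value in literals:
            if name == "_id":
                subject = value

    triples = []
    for name, value in literals:
        if redirect:
            if name == "_id":
                continue  # assumption: the '_id' literal itself is not emitted
            match = REDIRECT.match(name)
            if match:
                # Per-literal redirect: "{to:ID}NAME" becomes (ID, NAME, value).
                triples.append((match.group(1), match.group(2), value))
                continue
        triples.append((subject, name, value))
    return triples
```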
description: Builds a String from a template and an Object. Provide the template in brackets. ${o} marks the place where the object is to be inserted. If the object is an instance of Triple, ${s}, ${p} and ${o} are used instead.
description: Validates JSON against a given schema and sends only valid input to the receiver. Pass the schema location to validate against. Write valid and/or invalid output to locations specified with writeValid and writeInvalid. Set the JSON key for the record ID value with idKey (for logging output, defaults to id).
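The placeholder behaviour can be mimicked with Python's `string.Template`, which happens to use the same `${...}` syntax. This is only an analogy under stated assumptions: `Triple` and `build_string` are hypothetical names, not part of the module.

```python
from string import Template
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str


def build_string(template: str, obj) -> str:
    """Fill ${o}, or ${s}, ${p} and ${o} if obj is a Triple."""
    if isinstance(obj, Triple):
        values = {"s": obj.subject, "p": obj.predicate, "o": obj.object}
    else:
        values = {"o": str(obj)}
    # safe_substitute leaves unknown placeholders untouched instead of failing.
    return Template(template).safe_substitute(values)
```

For instance, `build_string("${o} | ${s}", Triple("title", "count", "3"))` yields `3 | title`.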
description: Writes the object value of the triple into a file. The filename is constructed from subject and predicate. Please note: This module does not check if the filename constructed from subject and predicate stays within baseDir. THIS MODULE SHOULD NOT BE USED IN ENVIRONMENTS IN WHICH THE VALUES OF SUBJECT AND PREDICATE ARE PROVIDED BY AN UNTRUSTED SOURCE!
description: Writes the XML into the filesystem. The filename is constructed from the XPATH given as ‘property’. Variables are: target (determining the output directory), property (the element in the XML entity; constitutes the main part of the file’s name), startIndex (a subfolder will be extracted out of the filename; this marks the beginning of the index), stopIndex (a subfolder will be extracted out of the filename; this marks the end of the index)
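Because the module performs no such check itself, a caller may want to validate subject and predicate before passing triples on. The following Python sketch shows one possible containment check. The function name is hypothetical, and the `baseDir/subject/predicate` layout is an assumption, as the description does not specify how the filename is assembled.

```python
from pathlib import Path


def is_within_base_dir(base_dir: str, subject: str, predicate: str) -> bool:
    """Return True if the constructed path stays inside base_dir."""
    base = Path(base_dir).resolve()
    # Assumed layout: <base_dir>/<subject>/<predicate>
    target = (base / subject / predicate).resolve()
    # resolve() collapses '..' segments, so escapes become visible here.
    return base in target.parents
```

A subject such as `../../etc` resolves to a location outside the base directory and is rejected, as is an absolute subject path.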