User GuidesAPI ReferenceRelease Notes
Doc HomeHelp CenterLog In
User Guides

Working with Dates

Convert timestamps to the ISO 8601 date format, add a time attribute to a date attribute to get a single timestamp, and obtain the number of days between two dates using transformation functions.

You can use the following formats and functions to work with dates.

Datetime Formats

Tamr Core uses the following standards:

You can specify how date and time values are formatted in your data. If your data does not use one of the Predefined Date Formats, you use the following syntax to provide the format that is used:

  • Date and time must be separated by a colon : character, such as dd/MM/yyyy:HH:mm.
  • Other separators that occur between values, such as - or spaces, can be entered as they appear in your data.
  • Times that use the 12-hour (AM/PM) clock are converted to the 24-hour timekeeping system.

You use the following case-sensitive placeholders to represent the components of date and time values:

  • dd is a zero-padded two-digit day, such as 25. d is valid for single-digit days.
  • MMM is a three-letter month, such as JAN.
  • MM is a zero-padded two-digit month, such as 01. M is valid for single-digit months.
  • yyyy is the four-digit year, such as 1999.
  • HH is the two-digit hour of a 24-hour clock time, such as 17.
  • hh is the two-digit hour of a 12-hour clock time, such as 08 (see a below for AM/PM).
  • mm is the two-digit minute, such as 01.
  • ss is the two-digit second, such as 30. S indicates additional units; add one S for each decimal place.
  • Z is the timezone in UTC. You can also use +HH:mm or -HH:mm for the UTC offset, such as −05:00 for New York on standard time.
  • a, including the leading space, is the AM/PM indicator for times that use the 12-hour clock.

Converting Timestamps to ISO 8601

The following functions convert timestamps (that is, a date, time, or both) in a variety of formats to the ISO 8601 date format. To convert timestamps in different formats to the ISO format, these functions apply any optional formats that you specify to your data first, and then use a predefined list of standard, frequently-used formats. As a result, these functions can convert timestamps in multiple formats to ISO 8601 format.

important Important: Prior to v2020.018, the datetime_to_iso and date_and_time_to_iso functions required explicit definitions for the datetime formats to convert. Beginning with v2020.018, these functions automatically use the predefined list of datetime formats in addition to your explicitly defined formats. As a result, the output of these functions may change.

datetime_to_iso

If your timestamp is a single datetime string stored in one attribute, this function converts it to ISO 8601 format. If you explicitly specify one or more optional formats, you do not have to include nanoseconds. If you do, and a datetime attribute does not include them, they are filled with zeroes.

Examples:

The example that follows relies on the list of predefined formats to convert your timestamps to the ISO format.

SELECT *,
  datetime_to_iso(get(<datetime_attribute>, 0), 
  array()) 
AS <iso_formatted_timestamp>

The example that follows explicitly specifies two datetime formats to convert before any formats that match those specified in the predefined list. The format array is case-sensitive, so notice that the input dataset uses the 24-hour clock for times.

SELECT *,
  datetime_to_iso(get(<datetime_attribute>, 0), 
  array('yyyy-MM-dd-:HH:mm:ss.S','yyyy-MM-dd :HH:mm:ss.S')) 
AS <iso_formatted_timestamp>

For more information, see DATETIME_TO_ISO.

date_and_time_to_iso

If your timestamp is separated into two different attributes for date and time, you can use this function to convert them to an ISO 8601 timestamp in a single attribute. If you explicitly specify one or more optional formats to convert, you must include the colon : separator even if there are no values for one of the input attributes.

Examples:

The example that follows relies on the list of predefined formats to convert values to the ISO format.

SELECT *,
  date_and_time_to_iso(get(<date_attribute>, 0), get(<time_attribute>,0), 
  array())
AS <iso_formatted_timestamp>

The example that follows explicitly specifies a date format to convert before any formats that match those supplied in the predefined list. In this example, the input dataset uses the 12-hour clock for times.

SELECT *,
  date_and_time_to_iso(get(<date_attribute>, 0), get(<time_attribute>,0), 
  array('dd MM yyyy:', 'dd/MM/yyyy:hh:mm a'))
AS <iso_formatted_timestamp>

For more information, see DATE_AND_TIME_TO_ISO.

Predefined Date Formats

Tamr Core automatically converts values in the following formats to ISO 8601 format. If your timestamps use a format that does not appear on this list, or if formats should be converted in a different sequence than listed below, you should specify that format in the array().

For example, if your timestamps include month, day, and year values in the format MM dd yyyy, be sure to specify it. That format is not included in the predefined list.

Tip: To find any records that have a timestamp value in a format that is not listed here, you can preview the effect of using the function with an empty array() first, then filter to records that result in a null to explore the format of the original timestamp. You can then explicitly include that format in the array.

Only the year is specified:

datetime_to_iso, time is emptydate_and_time_to_iso, time is empty
'yy''yy:'
'yyyy''yyyy:'
'-yyyy''-yyyy:'
'----yyyy''----yyyy:'
'-----yy''-----yy:'
'-----yyyy''-----yyyy:'
'-------yyyy''-------yyyy:'
'------yyyy''------yyyy:'

Only the month and year are specified:

datetime_to_iso, time is emptydate_and_time_to_iso, time is empty
'--MMMyyyy''--MMMyyyy:'
'--MMMyy''--MMMyy:'
'-MMMyyyy''-MMMyyyy:
'-MMMyy''-MMMyy:'
'MMMyyyy''MMMyyyy:'
'yyyyMMM''yyyyMMM:'
'yyyy-MMM''yyyy-MMM:'
'yyyy-MM''yyyy-MM:'

Only the day, month and year are specified:

This table presents the formats that each function attempts to convert, in the sequence attempted. Blank cells indicate differences between the lists the functions use to convert day, month, and year values without a time.

datetime_to_iso, time is emptydate_and_time_to_iso, time is empty
'dd-MMM-yy''dd-MMM-yy:'
'd-MMM-yy''d-MMM-yy:'
'd-MM-yyyy''d-MM-yyyy:'
'dd-MM-yyyy''dd-MM-yyyy:'
'dd-MMM-yy''dd-MMM-yy:'
'ddMMMyy''ddMMMyy:'
'ddMMMyyyy''ddMMMyyyy:'
'dMMMyy''dMMMyy:'
'dMMMyyyy''dMMMyyyy:'
'yyyy-MM-d''yyyy-MM-d:'
'yyyy-MM-dd''yyyy-MM-dd:'
'yyyyMMd''yyyyMMd:'
'yyyyMMdd''yyyyMMdd:'
'd MM yy'
'dd MM yyyy''dd MM yyyy:'
'd MM yyyy''d MM yyyy:'
'dMMMyy''dMMMyy:'
'dMMMyyyy''dMMMyyyy:'
'ddMMMyy''ddMMMyy:'
'ddMMMyyyy''ddMMMyyyy:'
'dd MMM yy'
'dd MMM yyyy'
'dd-MMM-yy'
'dd-MMM-yyy'
'd-MMM-yy'
'd-MMM-yyyy’

Only the year and time are specified:

  • 'yyyy:HH:mm:ss'
  • 'yy:HH:mm:ss'

Only the month, year and time are specified:

  • 'yyyyMMM:HH:mm:ss:'
  • '--MMMyyyy:HH:mm:ss'
  • 'MMMyyyy:HH:mm:ss'

The day, month, year, and time are specified:

  • 'yyyy-MM-d:HH:mm:ss'
  • 'yyyy-MM-dd:HH:mm:ss'
  • 'yyyyMMM:HH:mm:ss'
  • 'd/M/yyyy hh:mm:ss a'
  • 'dd/MM/yyyy hh:mm:ss a'
  • 'yyyy-MM-dd HH:mm:ss.S:'
  • 'yyyy-MM-dd-HH:mm:ss'

Example Date Formats

In this example, the 4th of July 1990 is represented in a source dataset in the following ways. The syntax to use to explicitly specify each format follows for each one.

DataFormat
07/04/1990MM/dd/yyyy
07-04-1990MM-dd-yyyy
7/4/90M/d/yy
4 JUL 90d MMM yy

Working with Dates in ISO 8601 Format

Other functions can act on strings that are already in ISO 8601 format. These are:

iso_datetime_add

Add or subtract time from a date. Valid units are: SEC (seconds), MIN (minutes), HRS (hours), DAY (days), WKS (weeks), YRS (years), and MTH (months).

Note: Durations of months and years must be whole numbers, since months and years, such as leap years, can vary in length.

For example, to add four hours to a date:

iso_datetime_add(<date_attribute>, 4.0, 'hrs') AS <date_attribute>

To get the day before:

iso_datetime_add(<date_attribute>, -1.0, 'day') AS <date_attribute>

iso_difference_days

Get the number of days between two dates in ISO 8601 format. Returns a negative number if <date1_attribute> is the later date of the two.

iso_difference_days(<date1_attribute>,<date2_attribute>) AS <duration_attribute>

duration_to_iso

Convert a number to ISO 8601 duration format. Valid units are SEC (seconds), MIN (minutes), HRS (hours), DAY (days), and WKS (weeks).

duration_to_iso(<duration_attribute>, 'DAY', true) AS <duration_attribute>

dayofmonth

Calaculate the day of month (1 to 31) of a date in ISO 8601 format.

dayofmonth(<date_attribute>) AS <date_attribute>

Dealing with Two-Digit Years

A known issue is that the dayofmonth function inherits an issue from Java 8: any two-digit year passed to the function is treated as a year in the range 2000-2099. If your data includes dates prior to 2000, Tamr recommends this workaround.

In the following example, Date is the name of the column or attribute that has the date string being converted.

  1. Convert the Date to the ISO 8601 standard using the function datetime_to_iso.
  2. Next, use a case expression to detect any year into the future, such as 2080, and if the year is greater than 2030, convert it to a year in the past, or a 19XX year, such as 1980.

In this example, the year 2030 is used as as a cutoff point, but you can change it to another year. You can also use the current year as the cutoff point by using the function runtime_date.

// This SELECT converts Date into the ISO standard. You need to match the array formats to match your input data.
SELECT *, datetime_to_iso(Date, array('dd-MMM-yy')) AS my_iso_date;

// This CASE expression checks the year of my_iso_date and converts from a 20xx year to a 19xx year if the year is greater than 2030. 
// The year 2030 is an example and should be updated for your use case. 
SELECT *,
CASE WHEN year(my_iso_date) > 2030
THEN substr2(iso_datetime_add(my_iso_date || 'T:00:00', -100.0, 'YRS'), 0, -9)
ELSE my_iso_date
END AS my_iso_date;

Working with Unix Time

In cases where a timestamp is in the form of seconds (or milliseconds) from epoch, you can use the iso_datetime_add function to convert to ISO format. For example, if your data contains a timestamp in seconds from epoch in a date column you can use the following transformation to convert it to ISO format and place it in the iso_date column:

SELECT *, iso_datetime_add('1970-01-01T00:00:00', date, 'sec') as iso_date;

This example assumes a timezone of UTC for the input date.

See Unix time.