Jason

Reputation: 15358

Extract 4-digit year value from a string

I have a year listed in my string

$s = "Acquired by the University in 1988";

In practice, the year could be anywhere in this single-line string. How do I extract it using regex? I tried \d and that didn't work; it just produced an error.

I'm using preg_match in LAMP 5.2

Upvotes: 4

Views: 5037

Answers (8)

Paul Alexander

Reputation: 32367

/(^|\s)(\d{4})(\s|$)/gm

Matches

Acquired by the University in 1988
The 1945 vintage was superb
1492 columbus sailed the ocean blue

Ignores

There were nearly 10000 people there!
Member ID 45678
Phone Number 951-555-2563
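
Note that PHP's preg functions do not accept the g modifier shown in the pattern above; preg_match_all plays that role. A minimal sketch of applying the pattern in PHP, using the sample lines listed above (the variable names are illustrative):

```php
<?php
// Sample lines from the lists above, joined into one multi-line string.
$text = implode("\n", array(
    "Acquired by the University in 1988",
    "The 1945 vintage was superb",
    "1492 columbus sailed the ocean blue",
    "There were nearly 10000 people there!",
    "Member ID 45678",
    "Phone Number 951-555-2563",
));

// PHP has no "g" modifier; preg_match_all() finds every match instead.
// The "m" modifier lets ^ and $ match at the start and end of each line.
preg_match_all('/(^|\s)(\d{4})(\s|$)/m', $text, $matches);

// Group 2 holds the four digits without the surrounding whitespace.
$years = $matches[2];
print_r($years); // 1988, 1945, 1492
```

Because the surrounding whitespace is consumed as part of each match, two years separated by a single space would not both be found; the word-boundary patterns in the other answers avoid that.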

Upvotes: 1

Shaig Khaligli

Reputation: 5485

Also, if your string is something like this:

$date = "20044Q";

You can use the code below to extract a year (1900 to 2099) from such a string.

preg_match('/(?:(?:19|20)[0-9]{2})/', $date, $matches);
echo $matches[0];

Upvotes: 0

intuited

Reputation: 24044

/(?<!\d)\d{4}(?!\d)/ will match only 4-digit numbers that do not have digits before or after them.

(?<!\d) and (?!\d) are look-behind and look-ahead (respectively) assertions that ensure that a \d does not occur before or after the main part of the RE.

It may in practice be more sensible to use \b instead of the assertions; this will ensure that the beginning and end of the year occur at a "word boundary". So then "1337hx0r" would be appropriately ignored.

If you are only looking for years within the past century or so, you could use

/\b(19|20)\d{2}\b/
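
As a rough illustration of the difference between the three patterns above, here is a small sketch. The variable names are made up for the example; the test strings come from this thread.

```php
<?php
$lookaround = '/(?<!\d)\d{4}(?!\d)/';
$boundary   = '/\b\d{4}\b/';
$recent     = '/\b(19|20)\d{2}\b/';

$leet = "1337hx0r";

// The lookarounds only forbid neighbouring digits, so "1337" followed by "h" matches.
$lookaroundHit = preg_match($lookaround, $leet); // 1

// \b needs a word boundary after the last digit; there is none between "7" and "h".
$boundaryHit = preg_match($boundary, $leet);     // 0

// The restricted pattern still finds the year in the question's string.
$s = "Acquired by the University in 1988";
preg_match($recent, $s, $m);
$year = $m[0];                                   // "1988"

// A year outside 1900-2099 is skipped by the restricted pattern.
$oldHit = preg_match($recent, "1492 columbus sailed the ocean blue"); // 0
```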

Upvotes: 0

ridgerunner

Reputation: 34395

You need a regex to match four digits, and these four digits must comprise a whole word (e.g. a string of 10 digits contains four digits but is not a year). Thus, the regex needs to include word boundaries, like so:

if (preg_match('/\b\d{4}\b/', $s, $matches)) {
    $year = $matches[0];
}

Upvotes: 15

Ben Rowe

Reputation: 28711

For a basic year match, assuming only one year:

$year = false;
if(preg_match("/\d{4}/", $string, $match)) {
  $year = $match[0];
}

If you need to handle the possibility of multiple years in the same string:

if(preg_match_all("/\d{4}/", $string, $matches, PREG_SET_ORDER)) {
  foreach($matches as $match) {
    $year = $match[0];
  }
}
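
The loop above overwrites $year on each pass, so only the last match survives. If every year is needed, one possible variation is to keep the whole list. This is a sketch with a made-up sample string; it uses the default match ordering, where $matches[0] holds all full matches.

```php
<?php
// Hypothetical sample input containing several years.
$string = "Founded in 1891, rebuilt in 1954 and acquired in 1988";

$years = array();
// Without PREG_SET_ORDER, $matches[0] is a flat list of every full match.
if (preg_match_all("/\d{4}/", $string, $matches)) {
    $years = $matches[0];
}

print_r($years); // 1891, 1954, 1988
```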

Upvotes: 0

Demian Brecht

Reputation: 21368

Well, you could use \d{4}, but that will break if there's anything else in the string with four digits.

Edit:

The problem is that, apart from the four numeric characters, there isn't any other identifying information (according to your requirements, the number can be anywhere in the string). Based on what you've written, this is probably the best you can do, short of range-checking the returned value.

$str = "the year is 1988";
preg_match('/\d{4}/', $str, $matches);

var_dump($matches);
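
The range check mentioned above might look something like the following sketch. The helper name extract_year and the default bounds of 1900 and 2099 are assumptions chosen for illustration, not something the question specifies.

```php
<?php
// Hypothetical helper: returns the first 4-digit run that falls within
// [$min, $max] as an integer, or null when no candidate is in range.
function extract_year($str, $min = 1900, $max = 2099) {
    if (preg_match_all('/\d{4}/', $str, $matches)) {
        foreach ($matches[0] as $candidate) {
            $year = (int) $candidate;
            if ($year >= $min && $year <= $max) {
                return $year;
            }
        }
    }
    return null;
}

var_dump(extract_year("the year is 1988"));  // int(1988)
var_dump(extract_year("Member ID 4567"));    // NULL
```

Adjust the bounds to whatever range of years is plausible for the data at hand.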

Upvotes: 3

anubhava

Reputation: 785236

Try this code:

<?php
  $s = "Acquired by the University in 1988 year.";
  $yr = preg_replace('/^[^\d]*(\d{4}).*$/', '\1', $s);
  var_dump($yr);
?>

OUTPUT:

string(4) "1988"

However, this regex works on the assumption that a 4-digit number appears just once in the line.
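
To illustrate a related limitation, here is a sketch with a hypothetical input that contains a shorter number before the year. In that case the pattern does not match at all, and preg_replace returns the input unchanged.

```php
<?php
$pattern = '/^[^\d]*(\d{4}).*$/';

// The original example: the first digits in the line are the year.
$ok = preg_replace($pattern, '\1', "Acquired by the University in 1988 year.");

// Hypothetical input: "12" comes first, so \d{4} cannot match right after [^\d]*.
$input = "Room 12 was acquired in 1988";
$unchanged = preg_replace($pattern, '\1', $input);

var_dump($ok);                    // string(4) "1988"
var_dump($unchanged === $input);  // bool(true)
```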

Upvotes: 2

k to the z

Reputation: 3185

preg_match('/(\d{4})/', $string, $matches);

Upvotes: 0
