victorx

Reputation: 3559

AWS Cloudwatch Log - Is it possible to export existing log data from it?

I have managed to push my application logs to AWS CloudWatch using the CloudWatch Logs agent, but the CloudWatch web console does not seem to provide a button to download/export the log data from it.

Any idea how I can achieve this goal?

Upvotes: 105

Views: 136940

Answers (16)

Fred

Reputation: 11

Mainly based on the previous response, but with parameters and a fix to the time filter.

#!/bin/bash


usage() {
  echo "Usage: $0" 1>&2
  echo " -l | --log-group : log-group name to extract" 1>&2
  echo " -d | --delay     : negative delay"  1>&2
  echo " -u | --unit unit : year|month|day|hour time unit to specify"  1>&2
  echo " -p | --profile   : profile aws to use" 1>&2 
  echo " -r | --region    : region to use" 1>&2
  echo " Extract aws cloudwatch log command : extract the log of the last [delay] [unit], example : -1 year " 1>&2
  exit 1
}

function dumpstreams() {
  echo "Dumpstream"
  echo "logs describe-log-streams"
  aws $AWSARGS logs describe-log-streams \
    --order-by LastEventTime --descending --log-group-name $LOGGROUP \
    --output text | while read -a st; do
      [ "${st[2]}" -lt "$starttime" ] && continue
      stname="${st[1]}"
      echo ${stname##*:}
    done | while read stream; do
      aws $AWSARGS logs get-log-events \
        --start-from-head --start-time $starttime \
        --log-group-name $LOGGROUP --log-stream-name "${stream}" --output table
    done
}

if [ $# -ne 10 ]; then
  echo "All parameters are required!"
  usage
  exit 1
fi

VALID_ARGS=$(getopt -o l:d:u:p:r: --long log-group:,delay:,unit:,profile:,region: -- "$@")
 if [[ $? -ne 0 ]]; then
    usage
    exit 1;
fi

units=('year' 'month' 'day' 'hour')
eval set -- "$VALID_ARGS"
nb_input_param=0
while [ : ]; do
  case "$1" in
    -l | --log-group)
        nb_input_param=$(( nb_input_param + 1 ))
        log_group="$2"
        echo "log-group :${log_group}"
        shift 2
        ;;
    -d | --delay)
        nb_input_param=$(( nb_input_param + 1 ))
        delay="$2"
        # the delay must be negative (a point in the past)
        if [ "${delay}" -gt -1 ]; then
            usage
        fi
        echo "delay :${delay}"
        shift 2
        ;;
    -u | --unit)
        nb_input_param=$(( nb_input_param + 1 ))
        unit="$2"
        found=0
        for i in "${units[@]}"; do
            if [ "$i" = "$unit" ]; then
                found=1
            fi
        done
        if [ "${found}" -eq "0" ]; then
            usage
        fi
        echo "unit :${unit}"
        shift 2
        ;;
    -p | --profile)
        nb_input_param=$(( nb_input_param + 1 ))
        profile="$2"
        echo "profile :${profile}"
        shift 2
        ;;
    -r | --region)
        nb_input_param=$(( nb_input_param + 1 ))
        region="$2"
        echo "region :${region}"
        shift 2
        ;;
    -h | --help)
        shift
        ;;
    *)
        echo "$1"
        if [ "${1}" != "--" ]; then
            usage
        fi
        shift
        break
        ;;
  esac
done
echo "done"
AWSARGS="--profile ${profile} --region ${region}"
LOGGROUP="${log_group}"
 # TAIL=
starttime=$(date --date "${delay} ${unit}" +%s)000
 
nexttime=$(date +%s)000
 
dumpstreams
if [ -n "$TAIL" ]; then
  while true; do
    starttime=$nexttime
    nexttime=$(date +%s)000
    sleep 1
    dumpstreams
  done
fi

Upvotes: 0

samtoddler

Reputation: 9625

There are good answers here; I just wanted to add what I found, which was a bit tricky too. I wanted to download an entire log stream for one of the support cases I was working on.

The command below works as expected only on the latest log stream:

$ aws logs get-log-events --log-group-name  'my_log_group' \
--log-stream-name  'my_stream_name' --output text

If the log stream is not the latest, you need to specify the --start-from-head flag, otherwise it will not return the log events.

Without the --start-from-head flag:

$ aws logs get-log-events --log-group-name  'my_log_group' \
--log-stream-name  'my_stream_name' --output text

b/3812585230811583357791156477815594204/s    f/38126173438735188692892143032919214/s

With the --start-from-head flag:

$ aws logs get-log-events --log-group-name  'my_log_group' \
--log-stream-name  'my_stream_name' --start-from-head --output text

b/3812585230811583357791156477815594204/s    f/3812617343873518869289214303291960224/s
EVENTS  1709568940700   INIT_START Runtime Version: python:3.12.v29    
Runtime Version ARN: arn:aws:lambda:us-east-1::runtime:2fb93380dac14772d30092f171a3f757

Below is a snippet from the AWS CLI docs:

--start-from-head | --no-start-from-head (boolean)
If the value is true, the earliest log events are returned first. If the value is false,
the latest log events are returned first. The default value is false.

If you are using a previous nextForwardToken value as the nextToken in this operation,
you must specify true for startFromHead.
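
For illustration (a hedged sketch; the group, stream and the token value are placeholders taken from the output above), a follow-up page can be fetched by passing the previous nextForwardToken back in while keeping --start-from-head:

$ aws logs get-log-events --log-group-name  'my_log_group' \
--log-stream-name  'my_stream_name' --start-from-head \
--next-token 'f/3812617343873518869289214303291960224/s' --output text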

Upvotes: 3

Raymond Chiu

Reputation: 1064

I use Windows and was inspired by Slavomir. I wrote a Go program and built awsExportLog.exe to call the AWS logs command and retrieve the logs from a time range.

Source Code is in following link: https://github.com/onionhead0708/awsExportLog/blob/main/awsExportLog.go

You can build awsExportLog.exe with the following command:

go build 

The following example exports one hour of log messages, starting from 2024-01-13T14:25:00Z, to myLogfile.log:

awsExportLog.exe -r=us-west-2 -g=/aws/containerinsights/my-log-group -s=mytest.restapi -f=2024-01-13T14:25:00Z -d=1h > myLogfile.log

Arguments:

  -d string
        Duration of the log to be taken from the From time. e.g. 1m1s = 1 minute 1 second (default "1h")
  -f string
        From time in RFC3339 format. e.g.: 2024-02-13T14:25:60Z
  -g string
        AWS log group name
  -r string
        AWS region
  -s string
        AWS log stream name

Prerequisite:

  1. You need to install the AWS CLI - https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html
  2. You need to be logged in to your AWS profile.

Note: awsExportLog.exe just calls the AWS CLI command "aws logs get-log-events".

Upvotes: 0

Pavel Voropaev

Reputation: 63

For a stream with more than 10,000 events, it is more resilient to use the forward token together with --start-from-head. That avoids issues with empty or duplicate events when the time range is not exactly right or the events are too dense.

#!/bin/bash

# Required parameters
REGION="PLACEHOLDER"
LOG_GROUP_NAME="PLACEHOLDER"
LOG_STREAM_NAME="PLACEHOLDER"

# Temporary files
TEMP_FILE="temp_output.txt"
FINAL_OUTPUT_FILE="final_output.txt"

# Initial command
aws logs get-log-events --region $REGION --log-group-name $LOG_GROUP_NAME --log-stream-name $LOG_STREAM_NAME --start-from-head --output text > $TEMP_FILE

# Check number of lines in the file
while [ $(wc -l <"$TEMP_FILE") -gt 1 ]
do
  # Get the next token
  NEXT_TOKEN=$(grep -o -E "^b/[0-9]+/s\s+f/[0-9]+/s$" $TEMP_FILE |awk '{print $2}')

  # If we didn't find a next token, exit the loop
  if [ -z "$NEXT_TOKEN" ]; then
    break
  fi

  # Remove the token line from the temp file and append the rest to the final output file
  grep -v -E "^b/[0-9]+/s\s+f/[0-9]+/s$" "$TEMP_FILE" >> "$FINAL_OUTPUT_FILE"
  
  # Fetch more logs with the next token
  aws logs get-log-events --region $REGION --log-group-name $LOG_GROUP_NAME --log-stream-name $LOG_STREAM_NAME --next-token $NEXT_TOKEN --output text > $TEMP_FILE
done

# Append remaining content in the temp file to the final output file
grep -v -E "^b/[0-9]+/s\s+f/[0-9]+/s$" "$TEMP_FILE" >> "$FINAL_OUTPUT_FILE"

# Cleanup
rm $TEMP_FILE

Upvotes: 1

Mike McKay

Reputation: 2626

The other answers were not useful with AWS Lambda logs since they create many log streams and I just wanted to dump everything in the last week. I finally found the following command to be what I needed:

aws logs tail --since 1w  LOG_GROUP_NAME > output.log

Note that LOG_GROUP_NAME is the Lambda function's log group path (e.g. /aws/lambda/FUNCTION_NAME) and you can replace the --since argument with a variety of durations (1w = 1 week, 5m = 5 minutes, etc.).
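
If your AWS CLI version supports them (a hedged sketch; FUNCTION_NAME is a placeholder), aws logs tail also accepts --follow, --format and --filter-pattern, which are handy for the same Lambda use case:

# Follow new events as they arrive, showing only lines matching a filter
aws logs tail /aws/lambda/FUNCTION_NAME --since 1h --follow --filter-pattern "ERROR"

# Compact one-line output redirected to a file
aws logs tail /aws/lambda/FUNCTION_NAME --since 1w --format short > output.log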

Upvotes: 24

SelftaughtMonk

Reputation: 1037

I had a similar use case where I had to download all the streams for a given log group. See if this script helps.

#!/bin/bash

if [[ "${#}" != 1 ]]
then
    echo "This script requires two arguments!"
    echo
    echo "Usage :"
    echo "${0} <log-group-name>"

    exit 1
fi

streams=$(aws logs describe-log-streams --log-group-name "${1}")

# Iterate over the stream indexes returned by describe-log-streams
# (note: without pagination this returns at most 50 streams per call)
for stream in $(jq '.logStreams | keys | .[]' <<< "$streams"); do
    record=$(jq -r ".logStreams[$stream]" <<< "$streams")
    streamName=$(jq -r ".logStreamName" <<< "$record")
    echo "Downloading ${streamName}"
    aws logs get-log-events --log-group-name "${1}" --log-stream-name "$streamName" --output json > "${stream}.log"
    echo "Completed download: ${streamName}"
done

You have to pass the log group name as an argument.

E.g.: bash <name_of_the_bash_file>.sh <group_name>

Upvotes: 0

ekeyser

Reputation: 605

export LOGGROUPNAME=[SOME_LOG_GROUP_NAME]
for LOGSTREAM in $(aws --output text logs describe-log-streams --log-group-name ${LOGGROUPNAME} | awk '{print $7}'); do
  aws --output text logs get-log-events --log-group-name ${LOGGROUPNAME} --log-stream-name ${LOGSTREAM} >> ${LOGGROUPNAME}_output.txt
done

Upvotes: 1

Slavomir

Reputation: 541

Inspired by saputkin, I have created a Python script that downloads all the logs for a log group in a given time period.

The script itself: https://github.com/slavogri/aws-logs-downloader.git

In case there are multiple log streams for that period, multiple files will be created. Downloaded files will be stored in the current directory and will be named after the log streams that have log events in the given time period. (If the group name contains forward slashes, they will be replaced by underscores. Each file will be overwritten if it already exists.)

Prerequisite: You need to be logged in to your AWS profile. The script itself uses, on your behalf, the AWS command line APIs "aws logs describe-log-streams" and "aws logs get-log-events".

Usage example: python aws-logs-downloader -g /ecs/my-cluster-test-my-app -t "2021-09-04 05:59:50 +00:00" -i 60

optional arguments:
   -h, --help         show this help message and exit
   -v, --version      show program's version number and exit
   -g , --log-group   (required) Log group name for which the log stream events needs to be downloaded
   -t , --end-time    (default: now) End date and time of the downloaded logs in format: %Y-%m-%d %H:%M:%S %z (example: 2021-09-04 05:59:50 +00:00)
   -i , --interval    (default: 30) Time period in minutes before the end-time. This will be used to calculate the time since which the logs will be downloaded.
   -p , --profile     (default: dev) The aws profile that is logged in, and on behalf of which the logs will be downloaded.
   -r , --region      (default: eu-central-1) The aws region from which the logs will be downloaded.

Please let me know if it was useful to you. :)

After I did it I learned that there is another option using Boto3: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/logs.html#CloudWatchLogs.Client.get_log_events

Still, the command line API seems to me like a good option.

Upvotes: 4

Guss

Reputation: 32315

The latest AWS CLI has a CloudWatch Logs CLI that allows you to download the logs as JSON, a text file, or any other output format supported by the AWS CLI.

For example, to get the first 1 MB (up to 10,000 log entries) from the stream a in group A into a text file, run:

aws logs get-log-events \
   --log-group-name A --log-stream-name a \
   --output text > a.log

The command is currently limited to a response size of at most 1 MB (up to 10,000 records per request); if you have more, you need to implement your own page-stepping mechanism using the --next-token parameter. I expect that in the future the CLI will also allow a full dump in a single command.
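
A minimal sketch of such a paging loop (assuming jq is installed; group A and stream a are the same placeholders as above) keeps requesting pages until the returned nextForwardToken stops changing, which marks the end of the stream:

#!/bin/bash
LOG_GROUP=A
LOG_STREAM=a
TOKEN=""
PREV_TOKEN="initial"
> a.log
while [ "$TOKEN" != "$PREV_TOKEN" ]; do
  PREV_TOKEN=$TOKEN
  if [ -z "$TOKEN" ]; then
    # first page: start from the oldest events
    RESP=$(aws logs get-log-events --log-group-name "$LOG_GROUP" \
             --log-stream-name "$LOG_STREAM" --start-from-head --output json)
  else
    # subsequent pages: pass the forward token from the previous response
    RESP=$(aws logs get-log-events --log-group-name "$LOG_GROUP" \
             --log-stream-name "$LOG_STREAM" --start-from-head \
             --next-token "$TOKEN" --output json)
  fi
  echo "$RESP" | jq -r '.events[].message' >> a.log
  TOKEN=$(echo "$RESP" | jq -r '.nextForwardToken')
done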

Update

Here's a small Bash script to list events from all streams in a specific group, since a specified time:

#!/bin/bash
function dumpstreams() {
  aws $AWSARGS logs describe-log-streams \
    --order-by LastEventTime --log-group-name $LOGGROUP \
    --output text | while read -a st; do 
      [ "${st[4]}" -lt "$starttime" ] && continue
      stname="${st[1]}"
      echo ${stname##*:}
    done | while read stream; do
      aws $AWSARGS logs get-log-events \
        --start-from-head --start-time $starttime \
        --log-group-name $LOGGROUP --log-stream-name $stream --output text
    done
}

AWSARGS="--profile myprofile --region us-east-1"
LOGGROUP="some-log-group"
TAIL=
starttime=$(date --date "-1 week" +%s)000
nexttime=$(date +%s)000
dumpstreams
if [ -n "$TAIL" ]; then
  while true; do
    starttime=$nexttime
    nexttime=$(date +%s)000
    sleep 1
    dumpstreams
  done
fi

That last part, if you set TAIL, will continue to fetch log events and will report newer events as they come in (with some expected delay).

Upvotes: 147

user2679290

Reputation: 197

Adapted @Guss's answer to macOS. As I am not really a bash guy, I had to use Python to convert dates to a human-readable form.

runawslog -1w gets the last week, and so on:

runawslog() { sh awslogs.sh $1 | grep "EVENTS" | python parselogline.py; }

awslogs.sh:

#!/bin/bash
#set -x
function dumpstreams() {
  aws $AWSARGS logs describe-log-streams \
    --order-by LastEventTime --log-group-name $LOGGROUP \
    --output text | while read -a st; do 
      [ "${st[4]}" -lt "$starttime" ] && continue
      stname="${st[1]}"
      echo ${stname##*:}
    done | while read stream; do
      aws $AWSARGS logs get-log-events \
        --start-from-head --start-time $starttime \
        --log-group-name $LOGGROUP --log-stream-name $stream --output text
    done
}
AWSARGS=""
#AWSARGS="--profile myprofile --region us-east-1"
LOGGROUP="/aws/lambda/StockTrackFunc"
TAIL=
FROMDAT=$1
starttime=$(date -v ${FROMDAT} +%s)000
nexttime=$(date +%s)000
dumpstreams
if [ -n "$TAIL" ]; then
  while true; do
    starttime=$nexttime
    nexttime=$(date +%s)000
    sleep 1
    dumpstreams
  done
fi

parselogline.py:

import sys
import datetime

# Read the "EVENTS" lines produced by get-log-events (tab-separated:
# label, timestamp in milliseconds, message) and print a readable timestamp.
dat = sys.stdin.read()
for k in dat.split('\n'):
    d = k.split('\t')
    if len(d) < 3:
        continue
    d[2] = '\t'.join(d[2:])
    print(str(datetime.datetime.fromtimestamp(int(d[1]) / 1000)) + '\t' + d[2])

Upvotes: 0

Chaitanya Bapat

Reputation: 4029

I found the AWS documentation to be complete and accurate: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/S3ExportTasks.html. It lays down the steps for exporting logs from CloudWatch to S3.
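
For a scriptable route, the same export can be started from the CLI. A hedged sketch (the bucket name, prefix and millisecond timestamps are placeholders; the bucket needs the policy described in the linked doc and in Josh Vickery's answer below):

# Start an export task for a time range (timestamps are milliseconds since the epoch)
aws logs create-export-task \
    --log-group-name my-log-group \
    --from 1609459200000 --to 1612137600000 \
    --destination my-export-bucket \
    --destination-prefix my-log-group-export

# Check the progress of the export task
aws logs describe-export-tasks --task-id <task-id-returned-above>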

Upvotes: -1

prophoto

Reputation: 362

You can now perform exports via the CloudWatch Management Console with the new CloudWatch Logs Insights page. Full documentation is here: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_ExportQueryResults.html. I had already started ingesting my Apache logs into CloudWatch as JSON, so YMMV if you haven't set it up in advance.

Add Query to Dashboard or Export Query Results

After you run a query, you can add the query to a CloudWatch dashboard, or copy the results to the clipboard.

Queries added to dashboards automatically re-run every time you load the dashboard and every time that the dashboard refreshes. These queries count toward your limit of four concurrent CloudWatch Logs Insights queries.

To add query results to a dashboard

Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.

In the navigation pane, choose Insights.

Choose one or more log groups and run a query.

Choose Add to dashboard.

Select the dashboard, or choose Create new to create a new dashboard for the query results.

Choose Add to dashboard.

To copy query results to the clipboard

Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.

In the navigation pane, choose Insights.

Choose one or more log groups and run a query.

Choose Actions, Copy query results.
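
If you prefer to script the same query instead of using the console, a rough CLI equivalent (a hedged sketch; the log group, query string and GNU date calls are assumptions) is:

# Start a Logs Insights query over the last hour (times are epoch seconds)
QUERY_ID=$(aws logs start-query \
    --log-group-name my-log-group \
    --start-time $(date -d '1 hour ago' +%s) \
    --end-time $(date +%s) \
    --query-string 'fields @timestamp, @message | sort @timestamp desc | limit 1000' \
    --query 'queryId' --output text)

# Give the query a moment to run, then fetch the results (re-run if the status is still Running)
sleep 5
aws logs get-query-results --query-id "$QUERY_ID" > results.json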

Upvotes: 3

Johnride

Reputation: 8736

I would add this one-liner to get all logs for a stream:

aws logs get-log-events --log-group-name my-log-group --log-stream-name my-log-stream | grep '"message":' | awk -F '"' '{ print $(NF-1) }' > my-log-group_my-log-stream.txt

Or in a slightly more readable format:

aws logs get-log-events \
    --log-group-name my-log-group\
    --log-stream-name my-log-stream \
    | grep '"message":' \
    | awk -F '"' '{ print $(NF-1) }' \
    > my-log-group_my-log-stream.txt

And you can make a handy script out of it that is admittedly less powerful than @Guss's but simple enough. I saved it as getLogs.sh and invoke it with ./getLogs.sh log-group log-stream

#!/bin/bash

if [[ "${#}" != 2 ]]
then
    echo "This script requires two arguments!"
    echo
    echo "Usage :"
    echo "${0} <log-group-name> <log-stream-name>"
    echo
    echo "Example :"
    echo "${0} my-log-group my-log-stream"

    exit 1
fi

OUTPUT_FILE="${1}_${2}.log"
aws logs get-log-events \
    --log-group-name "${1}"\
    --log-stream-name "${2}" \
    | grep '"message":' \
    | awk -F '"' '{ print $(NF-1) }' \
    > "${OUTPUT_FILE}"

echo "Logs stored in ${OUTPUT_FILE}"

Upvotes: 6

Josh Vickery

Reputation: 922

It seems AWS has added the ability to export an entire log group to S3.

[Screenshot: Export to S3 menu]

[Screenshot: Export to S3 form]

You'll need to set up permissions on the S3 bucket to allow CloudWatch to write to the bucket by adding the following to your bucket policy, replacing the region with your region and the bucket name with your bucket name.

    {
        "Effect": "Allow",
        "Principal": {
            "Service": "logs.us-east-1.amazonaws.com"
        },
        "Action": "s3:GetBucketAcl",
        "Resource": "arn:aws:s3:::tsf-log-data"
    },
    {
        "Effect": "Allow",
        "Principal": {
            "Service": "logs.us-east-1.amazonaws.com"
        },
        "Action": "s3:PutObject",
        "Resource": "arn:aws:s3:::tsf-log-data/*",
        "Condition": {
            "StringEquals": {
                "s3:x-amz-acl": "bucket-owner-full-control"
            }
        }
    }

Details can be found in Step 2 of this AWS doc

Upvotes: 30

Jan Vlcinsky

Reputation: 44102

There is also a Python project called awslogs that allows you to get the logs: https://github.com/jorgebastida/awslogs

It supports things like:

list log groups:

$ awslogs groups

list streams for given log group:

$ awslogs streams /var/log/syslog

get the log records from all streams:

$ awslogs get /var/log/syslog

get the log records from a specific stream:

$ awslogs get /var/log/syslog stream_A

and much more (filtering by time period, watching log streams, ...).

I think this tool might help you do what you want.
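
A hedged sketch of the time filtering and watching (check awslogs get --help for the exact flags in your installed version):

# get the last day of records from every stream in the group
$ awslogs get /var/log/syslog ALL --start='1 day ago' > syslog_last_day.log

# keep watching the group and print new events as they arrive
$ awslogs get /var/log/syslog ALL --watch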

Upvotes: 51

Naveen Vijay

Reputation: 16482

Apparently there isn't an out-of-the-box way to download CloudWatch Logs from the AWS Console. Perhaps you can write a script to fetch the CloudWatch Logs using the SDK / API.

The good thing about CloudWatch Logs is that you can retain the logs indefinitely (Never Expire), unlike CloudWatch metrics, which are kept for only 14 days. That means you can run the script at a monthly / quarterly frequency rather than on demand.

More information about the CloudWatch Logs API: http://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/Welcome.html http://awsdocs.s3.amazonaws.com/cloudwatchlogs/latest/cwl-api.pdf
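
A minimal sketch of such a script using only the CLI (the log group name is a placeholder, and the date arithmetic assumes GNU date/bash): filter-log-events pulls a time range across all streams of a group, and recent CLI versions paginate the call automatically.

#!/bin/bash
# Dump every event in a log group for the last 24 hours, across all of its streams.
LOG_GROUP="/my/log-group"
START=$(( ($(date +%s) - 86400) * 1000 ))   # start time, in milliseconds
END=$(( $(date +%s) * 1000 ))

aws logs filter-log-events \
    --log-group-name "$LOG_GROUP" \
    --start-time "$START" --end-time "$END" \
    --output text > "${LOG_GROUP//\//_}.log"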

Upvotes: 3
