AWS Bedrock

Collect AWS Bedrock model invocation logs with Elastic Agent.


Version	0.1.2
Compatible Kibana version(s)	8.12.0 or higher
Supported Serverless project types	Security, Observability
Subscription level	Basic
Level of support	Elastic

The AWS Bedrock model invocation logs integration connects your Bedrock model invocation logging to Elastic so that you can collect invocation logs and monitor usage. Elastic Security can use this data for security analytics, including correlation, visualization, and incident response. With invocation logging, you can collect the full request and response data, along with any metadata associated with use of your account.

Compatibility

This integration is compatible with the AWS Bedrock ModelInvocationLog schema, version 1.0.
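As a rough illustration of what schema compatibility means in practice, the following sketch checks whether a raw JSON log line declares the expected schema type and version before it is processed. The raw key names `schemaType` and `schemaVersion` are assumptions based on the exported `schema_type` and `schema_version` fields, and the sample record is hypothetical and heavily trimmed. The integration's ingest pipelines do their own processing; this is not their implementation.

```python
import json

# Values taken from the compatibility statement above.
SUPPORTED_SCHEMA_TYPE = "ModelInvocationLog"
SUPPORTED_SCHEMA_VERSION = "1.0"


def is_supported_record(raw_line: str) -> bool:
    """Return True when a JSON log line declares the supported schema."""
    try:
        record = json.loads(raw_line)
    except json.JSONDecodeError:
        return False
    if not isinstance(record, dict):
        return False
    # "schemaType" and "schemaVersion" are assumed raw key names.
    return (
        record.get("schemaType") == SUPPORTED_SCHEMA_TYPE
        and record.get("schemaVersion") == SUPPORTED_SCHEMA_VERSION
    )


# Hypothetical, trimmed record used only for illustration.
sample = json.dumps(
    {
        "schemaType": "ModelInvocationLog",
        "schemaVersion": "1.0",
        "modelId": "example-model",
    }
)
```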

Data streams

The AWS Bedrock model invocation logs integration currently provides a single data stream of model invocation logs, aws_bedrock.invocation.

Requirements

Elastic Agent must be installed.
You can install only one Elastic Agent per host.
Elastic Agent is required to stream data from the S3 bucket and ship the data to Elastic, where the events will then be processed via the integration's ingest pipelines.

Installing and managing an Elastic Agent:

You have a few options for installing and managing an Elastic Agent:

Install a Fleet-managed Elastic Agent (recommended):

With this approach, you install Elastic Agent and use Fleet in Kibana to define, configure, and manage your agents in a central location. We recommend using Fleet management because it makes the management and upgrade of your agents considerably easier.

Install Elastic Agent in standalone mode (advanced users):

With this approach, you install Elastic Agent and manually configure the agent locally on the system where it is installed. You are responsible for managing and upgrading the agents. This approach is reserved for advanced users only.

Install Elastic Agent in a containerized environment:

You can run Elastic Agent inside a container, either with Fleet Server or standalone. Docker images for all versions of Elastic Agent are available from the Elastic Docker registry, and we provide deployment manifests for running on Kubernetes.

Elastic Agent has minimum system requirements. For more information, refer to the Elastic Agent documentation.

The minimum kibana.version required is 8.12.0.

Setup

To use the AWS Bedrock model invocation logs, model invocation logging must be enabled and the logs must be sent to a log store destination, either S3 or CloudWatch. The full details are available in the AWS Bedrock User Guide, and the steps are outlined here.

Set up an Amazon S3 or CloudWatch Logs destination.
Enable logging. This can be done either through the AWS Bedrock console or the AWS Bedrock API.
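For the API route, the two steps above might look like the following sketch. It assumes the boto3 `bedrock` client and its `put_model_invocation_logging_configuration` operation. The bucket name, key prefix, log group name, and role ARN are hypothetical placeholders, and the helper names are introduced here for illustration. Consult the AWS Bedrock User Guide for the required bucket policy and IAM role permissions.

```python
def build_logging_config(bucket_name=None, key_prefix="", log_group_name=None, role_arn=None):
    """Build a loggingConfig payload with an S3 and/or CloudWatch Logs destination."""
    if not bucket_name and not log_group_name:
        raise ValueError("set an S3 bucket, a CloudWatch log group, or both")
    config = {
        # Which kinds of invocation data to deliver; adjust as needed.
        "textDataDeliveryEnabled": True,
        "imageDataDeliveryEnabled": True,
        "embeddingDataDeliveryEnabled": True,
    }
    if bucket_name:
        config["s3Config"] = {"bucketName": bucket_name, "keyPrefix": key_prefix}
    if log_group_name:
        if not role_arn:
            raise ValueError("a role ARN is needed for the CloudWatch destination")
        config["cloudWatchConfig"] = {"logGroupName": log_group_name, "roleArn": role_arn}
    return config


def enable_invocation_logging(logging_config, region_name):
    """Send the configuration to Bedrock (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above has no dependencies

    client = boto3.client("bedrock", region_name=region_name)
    client.put_model_invocation_logging_configuration(loggingConfig=logging_config)
```

A call such as `enable_invocation_logging(build_logging_config(bucket_name="example-bucket", key_prefix="bedrock/"), "us-east-1")` would then enable logging to S3 in that region, assuming the destination already exists.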

Collecting Bedrock model invocation logs from S3 bucket

When collecting logs from an S3 bucket is enabled, users can retrieve logs from S3 objects that are referenced by S3 notification events read from an SQS queue, or by directly polling the list of S3 objects in an S3 bucket.

Using SQS notifications is preferred: polling the list of S3 objects is expensive in terms of performance and cost, and should be used only when no SQS notification can be attached to the S3 bucket. This integration also supports S3 notifications delivered from SNS to SQS.
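To make the two notification paths concrete, the sketch below extracts bucket and object key pairs from an SQS message body, whether the S3 event arrived directly or wrapped in an SNS envelope. It relies on the standard S3 event notification layout (`Records[].s3.bucket.name` and `Records[].s3.object.key`) and the SNS envelope fields `Type` and `Message`. The function name is hypothetical, and this is an illustration of the message shapes rather than the integration's own code.

```python
import json
from urllib.parse import unquote_plus


def s3_objects_from_sqs_body(body: str):
    """Return (bucket, key) pairs referenced by an SQS message body."""
    payload = json.loads(body)
    # When S3 publishes to SNS and SNS forwards to SQS, the original
    # S3 event is carried as a JSON string in the "Message" field.
    if payload.get("Type") == "Notification" and "Message" in payload:
        payload = json.loads(payload["Message"])
    objects = []
    for record in payload.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        key = s3.get("object", {}).get("key")
        if bucket and key:
            # Object keys are URL-encoded in S3 event notifications.
            objects.append((bucket, unquote_plus(key)))
    return objects
```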

The SQS notification method is enabled by setting the queue_url configuration value. The S3 bucket list polling method is enabled by setting the bucket_arn and number_of_workers configuration values. queue_url and bucket_arn cannot both be set at the same time, and at least one of the two values must be set.
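The rule for choosing between the two methods can be expressed as a small validation function. This is a sketch of the constraint described above, not the integration's own validation logic. The function name and the returned labels are hypothetical, and the requirement that number_of_workers be at least 1 is an assumption.

```python
def select_s3_collection_method(queue_url=None, bucket_arn=None, number_of_workers=None):
    """Return "sqs" or "s3_polling" according to which settings are present."""
    if queue_url and bucket_arn:
        raise ValueError("queue_url and bucket_arn cannot both be set")
    if not queue_url and not bucket_arn:
        raise ValueError("one of queue_url or bucket_arn must be set")
    if queue_url:
        return "sqs"
    # Polling also needs number_of_workers; a minimum of 1 is assumed here.
    if number_of_workers is None or number_of_workers < 1:
        raise ValueError("number_of_workers must be set when bucket_arn is used")
    return "s3_polling"
```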

Collecting Bedrock model invocation logs from CloudWatch

When collecting logs from CloudWatch is enabled, users can retrieve logs from all log streams in a specific log group. The AWS filterLogEvents API is used to list log events from the specified log group.
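To show what a filterLogEvents listing returns, here is a minimal sketch that pages through every event in a log group using the boto3 CloudWatch Logs paginator for `filter_log_events`. The function name is hypothetical and the client is passed in by the caller. Elastic Agent performs this collection itself, so the sketch is only useful for inspecting a log group by hand.

```python
def iter_log_messages(logs_client, log_group_name):
    """Yield (log stream name, timestamp, message) for each event in a log group."""
    paginator = logs_client.get_paginator("filter_log_events")
    # Without a stream filter, events from all log streams in the group are listed.
    for page in paginator.paginate(logGroupName=log_group_name):
        for event in page.get("events", []):
            yield event["logStreamName"], event["timestamp"], event["message"]
```

With credentials configured, `iter_log_messages(boto3.client("logs"), "example-log-group")` would iterate over the events, where the log group name is a placeholder.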

Exported fields

Field	Description	Type
@timestamp	Date/time when the event originated. This is the date/time extracted from the event, typically representing when the event was generated by the source. If the event source has no original timestamp, this value is typically populated by the first time the event was received by the pipeline. Required field for all events.	date
aws.cloudwatch.message	CloudWatch log message.	text
aws.s3.bucket.arn	ARN of the S3 bucket that this log was retrieved from.	keyword
aws.s3.bucket.name	Name of the S3 bucket that this log was retrieved from.	keyword
aws.s3.object.key	Name of the S3 object that this log was retrieved from.	keyword
aws_bedrock.invocation.artifacts		flattened
aws_bedrock.invocation.error		keyword
aws_bedrock.invocation.error_code		keyword
aws_bedrock.invocation.image_generation_config.cfg_scale		double
aws_bedrock.invocation.image_generation_config.height		long
aws_bedrock.invocation.image_generation_config.number_of_images		long
aws_bedrock.invocation.image_generation_config.quality		keyword
aws_bedrock.invocation.image_generation_config.seed		long
aws_bedrock.invocation.image_generation_config.width		long
aws_bedrock.invocation.image_variation_params.images		keyword
aws_bedrock.invocation.image_variation_params.text		keyword
aws_bedrock.invocation.images		keyword
aws_bedrock.invocation.input.input_body_json		flattened
aws_bedrock.invocation.input.input_body_json_massive_hash		keyword
aws_bedrock.invocation.input.input_body_json_massive_length		long
aws_bedrock.invocation.input.input_body_s3_path		keyword
aws_bedrock.invocation.input.input_content_type		keyword
aws_bedrock.invocation.input.input_token_count		long
aws_bedrock.invocation.model_id		keyword
aws_bedrock.invocation.output.output_body_json		flattened
aws_bedrock.invocation.output.output_body_s3_path		keyword
aws_bedrock.invocation.output.output_content_type		keyword
aws_bedrock.invocation.output.output_token_count		long
aws_bedrock.invocation.request_id		keyword
aws_bedrock.invocation.result		keyword
aws_bedrock.invocation.schema_type		keyword
aws_bedrock.invocation.schema_version		keyword
aws_bedrock.invocation.task_type		keyword
cloud.account.id	The cloud account or organization id used to identify different entities in a multi-tenant environment. Examples: AWS account id, Google Cloud ORG Id, or other unique identifier.	keyword
cloud.availability_zone	Availability zone in which this host, resource, or service is located.	keyword
cloud.image.id	Image ID for the cloud instance.	keyword
cloud.instance.id	Instance ID of the host machine.	keyword
cloud.instance.name	Instance name of the host machine.	keyword
cloud.machine.type	Machine type of the host machine.	keyword
cloud.project.id	The cloud project identifier. Examples: Google Cloud Project id, Azure Project id.	keyword
cloud.provider	Name of the cloud provider. Example values are aws, azure, gcp, or digitalocean.	keyword
cloud.region	Region in which this host, resource, or service is located.	keyword
container.id	Unique container id.	keyword
container.image.name	Name of the image the container was built on.	keyword
container.labels	Image labels.	object
container.name	Container name.	keyword
data_stream.dataset	The field can contain anything that makes sense to signify the source of the data. Examples include `nginx.access`, `prometheus`, `endpoint` etc. For data streams that otherwise fit, but that do not have dataset set we use the value "generic" for the dataset value. `event.dataset` should have the same value as `data_stream.dataset`. Beyond the Elasticsearch data stream naming criteria noted above, the `dataset` value has additional restrictions: * Must not contain `-` * No longer than 100 characters	constant_keyword
data_stream.namespace	A user defined namespace. Namespaces are useful to allow grouping of data. Many users already organize their indices this way, and the data stream naming scheme now provides this best practice as a default. Many users will populate this field with `default`. If no value is used, it falls back to `default`. Beyond the Elasticsearch index naming criteria noted above, `namespace` value has the additional restrictions: * Must not contain `-` * No longer than 100 characters	constant_keyword
data_stream.type	An overarching type for the data stream. Currently allowed values are "logs" and "metrics". We expect to also add "traces" and "synthetics" in the near future.	constant_keyword
ecs.version	ECS version this event conforms to. `ecs.version` is a required field and must exist in all events. When querying across multiple indices -- which may conform to slightly different ECS versions -- this field lets integrations adjust to the schema version of the events.	keyword
event.dataset	Event dataset	constant_keyword
event.module	Name of the module this data is coming from. If your monitoring agent supports the concept of modules or plugins to process events of a given source (e.g. Apache logs), `event.module` should contain the name of this module.	constant_keyword
event.original	Raw text message of entire event. Used to demonstrate log integrity or where the full log message (before splitting it up in multiple parts) may be required, e.g. for reindex. This field is not indexed and doc_values are disabled. It cannot be searched, but it can be retrieved from `_source`. If users wish to override this and index this field, please see `Field data types` in the `Elasticsearch Reference`.	keyword
gen_ai.analysis.action_recommended	Recommended actions based on the analysis.	keyword
gen_ai.analysis.findings	Detailed findings from security tools.	nested
gen_ai.analysis.function	Name of the security or analysis function used.	keyword
gen_ai.analysis.tool_names	Name of the security or analysis tools used.	keyword
gen_ai.completion	The full text of the LLM's response.	text
gen_ai.compliance.request_triggered	Lists compliance-related filters that were triggered during the processing of the request, such as data privacy filters or regulatory compliance checks.	keyword
gen_ai.compliance.response_triggered	Lists compliance-related filters that were triggered during the processing of the response, such as data privacy filters or regulatory compliance checks.	keyword
gen_ai.compliance.violation_code	Code identifying the specific compliance rule that was violated.	keyword
gen_ai.compliance.violation_detected	Indicates if any compliance violation was detected during the interaction.	boolean
gen_ai.owasp.description	Description of the OWASP risk triggered.	text
gen_ai.owasp.id	Identifier for the OWASP risk addressed.	keyword
gen_ai.performance.request_size	Size of the request payload in bytes.	long
gen_ai.performance.response_size	Size of the response payload in bytes.	long
gen_ai.performance.response_time	Time taken by the LLM to generate a response in milliseconds.	long
gen_ai.performance.start_response_time	Time taken by the LLM to send first response byte in milliseconds.	long
gen_ai.policy.action	Action taken due to a policy violation, such as blocking, alerting, or modifying the content.	keyword
gen_ai.policy.confidence	Confidence level in the policy match that triggered the action, quantifying how closely the identified content matched the policy criteria.	keyword
gen_ai.policy.match_detail.*		object
gen_ai.policy.name	Name of the specific policy that was triggered.	keyword
gen_ai.policy.violation	Specifies if a security policy was violated.	boolean
gen_ai.prompt	The full text of the user's request to the LLM.	text
gen_ai.request.id	Unique identifier for the LLM request.	keyword
gen_ai.request.max_tokens	Maximum number of tokens the LLM generates for a request.	integer
gen_ai.request.model.description	Description of the LLM model.	keyword
gen_ai.request.model.id	Unique identifier for the LLM model.	keyword
gen_ai.request.model.instructions	Custom instructions for the LLM model.	text
gen_ai.request.model.role	Role of the LLM model in the interaction.	keyword
gen_ai.request.model.type	Type of LLM model.	keyword
gen_ai.request.model.version	Version of the LLM model used to generate the response.	keyword
gen_ai.request.temperature	Temperature setting for the LLM request.	float
gen_ai.request.timestamp	Timestamp when the request was made.	date
gen_ai.request.top_k	The top_k sampling setting for the LLM request.	float
gen_ai.request.top_p	The top_p sampling setting for the LLM request.	float
gen_ai.response.error_code	Error code returned in the LLM response.	keyword
gen_ai.response.finish_reasons	Reason the LLM response stopped.	keyword
gen_ai.response.id	Unique identifier for the LLM response.	keyword
gen_ai.response.model	Name of the LLM a response was generated from.	keyword
gen_ai.response.timestamp	Timestamp when the response was received.	date
gen_ai.security.hallucination_consistency	Consistency check between multiple responses.	float
gen_ai.security.jailbreak_score	Measures similarity to known jailbreak attempts.	float
gen_ai.security.prompt_injection_score	Measures similarity to known prompt injection attacks.	float
gen_ai.security.refusal_score	Measures similarity to known LLM refusal responses.	float
gen_ai.security.regex_pattern_count	Counts occurrences of strings matching user-defined regex patterns.	integer
gen_ai.sentiment.content_categories	Categories of content identified as sensitive or requiring moderation.	keyword
gen_ai.sentiment.content_inappropriate	Whether the content was flagged as inappropriate or sensitive.	boolean
gen_ai.sentiment.score	Sentiment analysis score.	float
gen_ai.sentiment.toxicity_score	Toxicity analysis score.	float
gen_ai.system	Name of the LLM foundation model vendor.	keyword
gen_ai.text.complexity_score	Evaluates the complexity of the text.	float
gen_ai.text.readability_score	Measures the readability level of the text.	float
gen_ai.text.similarity_score	Measures the similarity between the prompt and response.	float
gen_ai.threat.action	Recommended action to mitigate the detected security threat.	keyword
gen_ai.threat.category	Category of the detected security threat.	keyword
gen_ai.threat.description	Description of the detected security threat.	text
gen_ai.threat.detected	Whether a security threat was detected.	boolean
gen_ai.threat.risk_score	Numerical score indicating the potential risk associated with the response.	float
gen_ai.threat.signature	Signature of the detected security threat.	keyword
gen_ai.threat.source	Source of the detected security threat.	keyword
gen_ai.threat.type	Type of threat detected in the LLM interaction.	keyword
gen_ai.threat.yara_matches	Stores results from YARA scans including rule matches and categories.	nested
gen_ai.usage.completion_tokens	Number of tokens in the LLM's response.	integer
gen_ai.usage.prompt_tokens	Number of tokens in the user's request.	integer
gen_ai.user.id	Unique identifier for the user.	keyword
gen_ai.user.rn	Unique resource name for the user.	keyword
host.architecture	Operating system architecture.	keyword
host.containerized	If the host is a container.	boolean
host.domain	Name of the domain of which the host is a member. For example, on Windows this could be the host's Active Directory domain or NetBIOS domain name. For Linux this could be the domain of the host's LDAP provider.	keyword
host.hostname	Hostname of the host. It normally contains what the `hostname` command returns on the host machine.	keyword
host.id	Unique host id. As hostname is not always unique, use values that are meaningful in your environment. Example: The current usage of `beat.name`.	keyword
host.ip	Host ip addresses.	ip
host.mac	Host MAC addresses. The notation format from RFC 7042 is suggested: Each octet (that is, 8-bit byte) is represented by two [uppercase] hexadecimal digits giving the value of the octet as an unsigned integer. Successive octets are separated by a hyphen.	keyword
host.name	Name of the host. It can contain what hostname returns on Unix systems, the fully qualified domain name (FQDN), or a name specified by the user. The recommended value is the lowercase FQDN of the host.	keyword
host.os.build	OS build information.	keyword
host.os.codename	OS codename, if any.	keyword
host.os.family	OS family (such as redhat, debian, freebsd, windows).	keyword
host.os.kernel	Operating system kernel version as a raw string.	keyword
host.os.name	Operating system name, without the version.	keyword
host.os.name.text	Multi-field of `host.os.name`.	match_only_text
host.os.platform	Operating system platform (such as centos, ubuntu, windows).	keyword
host.os.version	Operating system version as a raw string.	keyword
host.type	Type of host. For Cloud providers this can be the machine type like `t2.medium`. If vm, this could be the container, for example, or other information meaningful in your environment.	keyword
input.type	Type of Filebeat input.	keyword
log.file.path	Full path to the log file this event came from, including the file name. It should include the drive letter, when appropriate. If the event wasn't read from a log file, do not populate this field.	keyword
log.offset	Log offset	long
message	For log events the message field contains the log message, optimized for viewing in a log viewer. For structured logs without an original message field, other fields can be concatenated to form a human-readable summary of the event. If multiple messages exist, they can be combined into one message.	match_only_text
tags	List of keywords used to tag each event.	keyword
user.id	Unique identifier of the user.	keyword

Changelog

Version	Details	Kibana version(s)
0.1.2	Bug fix: Add documentation image.	—
0.1.1	Bug fix: Fix documentation markdown.	—
0.1.0	Enhancement: Initial build.	—
