Downloading Jaccard Similarity Splunk Command
SHA256 checksum (jaccard-similarity-splunk-command_120.tgz) c2593f3136673b0e6e02c2cab4dace6478a7e8f33b4d89417fd415712217616c
SHA256 checksum (jaccard-similarity-splunk-command_110.tgz) 8b7f8817974f373b017669dec03d77bbd364a992a1d2240641157a87cf33c2df
SHA256 checksum (jaccard-similarity-splunk-command_100.tgz) 6f07042605b4a9c21bb341a67ca73cedb1f08896b14d47db994443a322b49965
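One way to check a downloaded archive against the published checksum is to hash it locally. The following is a minimal Python sketch using only the standard library; the helper name `sha256_of_file` and the local file path in the usage comment are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Checksum published on this page for version 1.2.0 of the app.
EXPECTED_120 = "c2593f3136673b0e6e02c2cab4dace6478a7e8f33b4d89417fd415712217616c"

# Usage (the path is a placeholder for wherever you saved the download):
#   ok = sha256_of_file("jaccard-similarity-splunk-command_120.tgz") == EXPECTED_120
```

If the comparison is false, the file was likely corrupted or altered in transit and should be downloaded again before installing.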
To install your download
To install apps and add-ons from within Splunk Enterprise
  1. Log into Splunk Enterprise.
  2. On the Apps menu, click Manage Apps.
  3. Click Install app from file.
  4. In the Upload app window, click Choose File.
  5. Locate the .tar.gz file you just downloaded, and then click Open or Choose.
  6. Click Upload.
  7. Click Restart Splunk, and then confirm that you want to restart.
To install apps and add-ons directly into Splunk Enterprise
  1. Put the downloaded file in the $SPLUNK_HOME/etc/apps directory.
  2. Untar and ungzip your app or add-on, using a tool like tar -xvf (on *nix) or WinZip (on Windows).
  3. Restart Splunk.
After you install a Splunk app, you will find it on Splunk Home. If you have questions or need more information, see Manage app and add-on objects.

Jaccard Similarity Splunk Command

This app is NOT supported by Splunk. Please read about what that means for you here.
Overview
# Jaccard Similarity Splunk Command
A custom Splunk command that calculates the mean Jaccard Similarity distance across all items in an MV field.

## Sample Use Case
```
index=azure sourcetype=azure:aad:user
| dedup id
| eval proxy_addr = mvfilter(match('proxyAddresses{}', "(?i)smtp"))
| eval proxy_addr = mvmap(proxy_addr, replace(proxy_addr, "(?i)smtp:", ""))
| jaccard textfield="proxy_addr"
| where mvcount(proxy_addr) > 10 AND jaccard_distance_proxy_addr < 0.3
```

The SPL example above extracts the `proxyAddresses` field from the Microsoft Entra ID (formerly Azure AD) log source (`azure:aad:user`) and keeps only its SMTP entries. The resulting `proxy_addr` field is a multi-value (MV) field containing the SMTP email addresses associated with an account.
The Jaccard Similarity Splunk Command calculates the average Jaccard Similarity score for all items in the field. It generates a new field named `jaccard_similarity_`, representing the similarity score.
A score close to 1.0 indicates that the items are highly similar.
A lower score, such as 0.25, suggests that the items are dissimilar.
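
To make the scoring concrete, here is a minimal Python sketch of a mean pairwise Jaccard similarity over the values of one MV field. It is an illustration, not the command's actual code: the tokenization into character bigrams, the function names, and the handling of fields with fewer than two values are all assumptions.

```python
from itertools import combinations


def char_ngrams(text, n=2):
    """Return the set of character n-grams in text (an assumed tokenization)."""
    text = text.lower()
    if len(text) < n:
        return {text} if text else set()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard_similarity(a, b):
    """Jaccard similarity |A & B| / |A | B| of two sets."""
    if not a and not b:
        return 1.0  # assumption: two empty sets count as identical
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(values, n=2):
    """Average the Jaccard similarity over every unordered pair of values."""
    sets = [char_ngrams(v, n) for v in values]
    pairs = list(combinations(sets, 2))
    if not pairs:
        return None  # assumption: undefined for fewer than two values
    return sum(jaccard_similarity(a, b) for a, b in pairs) / len(pairs)


similar = [
    "noreply1@mycompany.com",
    "noreply2@mycompany.com",
    "noreply3@mycompany.com",
]
print(mean_pairwise_jaccard(similar))
```

Under these assumptions, addresses that differ by a single character share almost all of their bigrams, so their mean similarity is high. The conventional Jaccard distance is one minus the similarity.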

### Explanation
The example query retrieves Azure AD accounts that meet the following conditions:
- They have more than 10 proxy addresses.
- Their `jaccard_distance_proxy_addr` value is below 0.3.

This effectively filters out users with a large number of diverse proxy addresses while retaining accounts where most proxy addresses are highly similar. A common application is identifying "Shared Mailboxes", which often have numerous but closely related proxy addresses, such as:

- noreply1@mycompany.com
- noreply2@mycompany.com
- noreply3@mycompany.com

By excluding such cases, this approach helps refine user analysis within your organization.

## Requirements
I intentionally left `splunklib` out of this repository to keep it lightweight. You can get the library from Splunk's GitHub repository for the [Splunk Python SDK](https://github.com/splunk/splunk-sdk-python).
Then copy the `splunklib` directory into this app's `bin` directory.
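
The copy step can also be scripted. This is a minimal Python sketch, assuming you have already cloned or downloaded the SDK; the function name `copy_splunklib` and both paths in the usage comment are hypothetical placeholders.

```python
import shutil
from pathlib import Path


def copy_splunklib(sdk_root, app_root):
    """Copy <sdk_root>/splunklib into <app_root>/bin/splunklib and return the target."""
    src = Path(sdk_root) / "splunklib"
    dst = Path(app_root) / "bin" / "splunklib"
    if not src.is_dir():
        raise FileNotFoundError(f"splunklib not found under {sdk_root}")
    dst.parent.mkdir(parents=True, exist_ok=True)
    # dirs_exist_ok requires Python 3.8+; it lets a re-run refresh an existing copy.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst


# Usage (placeholder paths, adjust to your environment):
#   copy_splunklib("/tmp/splunk-sdk-python", "/opt/splunk/etc/apps/<this app's directory>")
```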

Release Notes

Version 1.2.0
March 28, 2025
  • Better algorithm
Version 1.1.0
March 28, 2025
  • Better algorithm
Version 1.0.0
March 26, 2025
