About the tool:
Duplicate Invoice Identifier is an MS Access-based tool that helps identify duplicates in any Excel-based data. It supports up to 10 conditions and 25 match types for pinpointing duplicates. You can also define formatting conditions that clean the data before it is checked for duplicates.
Features:
- Supports two datasets (Current and Historic); each record in the current dataset is compared with every record in the historic dataset
- Offers two ways to import data into the tool:
  - Manual copy and paste
  - Import from an Excel file
- Provides user-friendly options for defining duplicate conditions
- The tool supports 25 types of match conditions
- You can also define formatting conditions to first format the data before checking for duplicates
Benefits:
- Save Money
- Save Time
- Increase Accuracy
Requirements:
- MS Access 2016 or later
- MS Excel 2016 or later
- Windows 7 or later
How to use this tool:
- Open the tool in MS Access 2007 or later
- You may see a warning message at the top because the file contains VBA code; click ‘Enable Content’
- Double-click the ‘Home’ form to open the tool
- A blank form opens, as shown below
- To use this tool for analysis, you need two datasets in Excel files
Data 1 (Current Data): This is the data in which you want to identify duplicates
Data 2 (Historic Data): This is the data against which the current data is compared to identify duplicates
Points to Note:
- Both the Current and Historic datasets must have the same format, including the same column order
- You can import at most 10 columns from a dataset into the tool
- The tool can read only 255 characters from each cell
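Before importing, it can help to check your data against these limits. The following is a minimal Python sketch, not part of the tool itself. The function names `same_column_order` and `check_dataset` are hypothetical, and the rows are assumed to be already loaded from Excel as lists of cell values.

```python
MAX_COLUMNS = 10      # the tool imports at most 10 columns
MAX_CELL_CHARS = 255  # the tool reads at most 255 characters per cell


def same_column_order(current_header, historic_header):
    """True when both datasets list the same columns in the same sequence."""
    def normalize(header):
        return [str(name).strip().lower() for name in header]
    return normalize(current_header) == normalize(historic_header)


def check_dataset(rows, expected_columns):
    """Return a list of human-readable problems found in the rows."""
    problems = []
    if expected_columns > MAX_COLUMNS:
        problems.append(
            f"{expected_columns} columns exceed the limit of {MAX_COLUMNS}"
        )
    for row_number, row in enumerate(rows, start=1):
        if len(row) != expected_columns:
            problems.append(
                f"row {row_number}: expected {expected_columns} columns, "
                f"found {len(row)}"
            )
        for col_number, cell in enumerate(row, start=1):
            if len(str(cell)) > MAX_CELL_CHARS:
                problems.append(
                    f"row {row_number}, column {col_number}: "
                    f"longer than {MAX_CELL_CHARS} characters"
                )
    return problems


# Hypothetical sample rows: the second row is missing a column.
sample_rows = [["INV-10023", "Acme Supplies Ltd"], ["INV-10024"]]
for problem in check_dataset(sample_rows, expected_columns=2):
    print(problem)
```

An empty list from `check_dataset` means the rows fit within the column and cell-length limits described above.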
See below a sample dataset:
- To import data into the tool, click the ‘Manage Data’ button
- There are two ways to import data into the tool:
Option 1 (Manual Copy Paste): Copy your data (without headers) from any Excel file and paste it into the tool
Option 2 (Import from Excel): Use the tool’s import functionality to browse to an Excel file and import the data
- Once the data is imported, define the datatype of each column. By default, every column is treated as text, so you must change the datatype explicitly. This step is important because some conditions can be defined only for specific datatypes. To define the datatype, select the appropriate option for each column
- Now it’s time to configure the tool to identify duplicates. Because a duplicate invoice can be processed in different ways, the tool comes with fully configurable conditions to catch it.
For example, say you want to identify duplicates where the invoice numbers look similar, the invoice dates are within ±5 days, the vendor names are similar, the amounts are within ±1 dollar, and the customer name is the same. The screenshot below shows such a duplicate:
- The tool comes with 25 match types; the table below can help you choose the right one.
- Let’s start configuring the tool to identify duplicates. First, define the Invoice Number condition as ‘Character Match [>70%]’. You can choose other character match options depending on how much variation you expect in the data. As you lower the character match percentage, you can expect more records to be flagged as duplicates
- If you want to remove special characters such as !@#$%^&*() before the data is compared, use the Formatting option
- For the Text datatype, you can use the ‘Remove Special Characters’ formatting option. For the Number datatype, you can use the ‘Remove decimal values’ and ‘Convert number to absolute’ options
- Next, define the condition for Invoice Date as shown below. You can choose other options as appropriate
- Define the Vendor Name condition as ‘Left Match [>60%]’
- Next, define the Amount condition as ‘Amount [+-1]’
- Finally, define the Customer Name condition as ‘Exact Match’
- That’s it. Click the ‘Analyze Data’ button to see the result.
Note: to stop the analysis partway through, click the same button again. You can follow the progress on the progress bar and percentage label at the bottom
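To make the example configuration concrete, here is a minimal Python sketch of how the five conditions could be combined for one pair of records. It does not show the tool's internal logic, which is implemented in VBA and not described here. Everything in it is an assumption: `difflib`'s similarity ratio stands in for ‘Character Match’, ‘Left Match’ is read as the share of the longer string covered by the common leading characters, and the field names and sample records are hypothetical.

```python
import re
from datetime import date
from difflib import SequenceMatcher


def remove_special_characters(text):
    """Keep only letters, digits, and spaces (assumed meaning of the option)."""
    return re.sub(r"[^A-Za-z0-9 ]", "", text)


def character_match(a, b):
    """Similarity in [0, 1]; difflib's ratio is a stand-in, not the tool's formula."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def left_match(a, b):
    """Length of the common prefix divided by the length of the longer string."""
    a, b = a.lower(), b.lower()
    if not a or not b:
        return 0.0
    prefix = 0
    for x, y in zip(a, b):
        if x != y:
            break
        prefix += 1
    return prefix / max(len(a), len(b))


def is_duplicate(current, historic):
    """Apply the five example conditions; every one of them must hold."""
    inv_current = remove_special_characters(current["invoice_number"])
    inv_historic = remove_special_characters(historic["invoice_number"])
    day_gap = abs((current["invoice_date"] - historic["invoice_date"]).days)
    return (
        character_match(inv_current, inv_historic) > 0.70            # Character Match [>70%]
        and day_gap <= 5                                             # invoice date within 5 days
        and left_match(current["vendor"], historic["vendor"]) > 0.60  # Left Match [>60%]
        and abs(current["amount"] - historic["amount"]) <= 1          # Amount [+-1]
        and current["customer"] == historic["customer"]               # Exact Match
    )


# Hypothetical records for illustration only.
current = {
    "invoice_number": "INV-10023",
    "invoice_date": date(2023, 3, 10),
    "vendor": "Acme Supplies Ltd",
    "amount": 1500.50,
    "customer": "Contoso",
}
historic = {
    "invoice_number": "INV10028",
    "invoice_date": date(2023, 3, 7),
    "vendor": "Acme Supplies Limited",
    "amount": 1500.00,
    "customer": "Contoso",
}
print(is_duplicate(current, historic))
```

In the tool, each record of the Current dataset is compared with every record of the Historic dataset, so a check like this would run once for every such pair.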
- Once the analysis is complete, a confirmation message box shows the number of duplicates found. Click ‘OK’ to proceed.
- To view the results in an Excel file, click the ‘Export Report to Excel’ button
- The report is divided into two sections:
Section 1 – Current Data: These are the records from Current Data that were found to be duplicates when compared with Historic Data. You can identify them by the value ‘Current Data’ in Column A (Record Type). These records are also highlighted in orange for easy identification
Section 2 – Historic Data: These are the matching records from Historic Data on the basis of which duplicates were identified in Current Data. You can identify them by the value ‘Historic Data’ in Column A (Record Type).
- Let’s have a look at a report with a few more records
In the above screenshot, 5 duplicates were found in Current Data, with 7 matching records from Historic Data. Each match is given a Match Number, shown in Column B (MatchNumber). To see the matching records for the first duplicate, filter Column B by 1
Similarly, to see the matching records for the fifth duplicate, filter Column B by 5
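The same filter can be applied outside Excel. The sketch below is a rough illustration in Python and assumes the exported report has already been loaded as a list of dictionaries keyed by column header. The ‘Record Type’ and ‘MatchNumber’ keys follow the column names above; the function name `records_for_match`, the ‘Invoice Number’ key, and the sample rows are hypothetical.

```python
def records_for_match(report_rows, match_number):
    """Return (current, historic) rows that share the given MatchNumber."""
    selected = [row for row in report_rows if row["MatchNumber"] == match_number]
    current = [row for row in selected if row["Record Type"] == "Current Data"]
    historic = [row for row in selected if row["Record Type"] == "Historic Data"]
    return current, historic


# Hypothetical report rows for illustration only.
report_rows = [
    {"Record Type": "Current Data", "MatchNumber": 1, "Invoice Number": "INV-10023"},
    {"Record Type": "Historic Data", "MatchNumber": 1, "Invoice Number": "INV10028"},
    {"Record Type": "Historic Data", "MatchNumber": 1, "Invoice Number": "INV10023"},
    {"Record Type": "Current Data", "MatchNumber": 2, "Invoice Number": "A-554"},
    {"Record Type": "Historic Data", "MatchNumber": 2, "Invoice Number": "A554"},
]

current, historic = records_for_match(report_rows, match_number=1)
print(len(current), "current record(s),", len(historic), "historic match(es)")
```

Grouping by match number like this keeps each flagged record next to the historic records that triggered it, which makes manual review easier.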
- You are now ready to use the tool and protect your business from duplicate payments.