How to Merge Regions in Multiple BED Files
In bioinformatics, you often need to merge overlapping or book-ended genomic intervals from two or more BED files into contiguous regions.
Tools such as bedtools and bedops can merge two or more BED files.
Method 1: Using bedtools
If you have a few BED files:
cat file1.bed file2.bed | bedtools sort | bedtools merge > merged.bed
If you have many BED files:
cat *.bed | bedtools sort | bedtools merge > merged.bed
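One thing to watch with the wildcard form: if a merged.bed from an earlier run is already in the directory, *.bed matches it and it gets read as input. The sketch below shows one way to build the input list while skipping the output file. The helper name bed_inputs and the demo file names a.bed and b.bed are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# bed_inputs OUT: hypothetical helper that prints every ./*.bed path
# except the output file OUT, one path per line.
bed_inputs() {
  for f in ./*.bed; do
    [ -f "$f" ] || continue          # the glob matched nothing
    [ "$f" = "./$1" ] && continue    # skip the output file itself
    printf '%s\n' "$f"
  done
}

# Demo in a scratch directory holding a stale merged.bed from an earlier run.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'chr1\t10\t100\n' > a.bed
printf 'chr1\t50\t200\n' > b.bed
printf 'chr1\t10\t200\n' > merged.bed

# Lists ./a.bed and ./b.bed, but not ./merged.bed
bed_inputs merged.bed
```

With bedtools installed, and assuming file names without whitespace, the list can feed the same pipeline: cat $(bed_inputs merged.bed) | bedtools sort | bedtools merge > merged.bed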
Method 2: Using bedops
bedops --merge file1.bed file2.bed file3.bed > merged.bed
The following practical examples demonstrate how to use bedtools and bedops to merge multiple BED files.
Example 1: Using bedtools
bedtools can merge overlapping or book-ended genomic intervals from multiple BED files.
For example, suppose you have the following three BED files (file1.bed, file2.bed, and file3.bed):
cat file1.bed
chr1 10 100
chr1 400 500
cat file2.bed
chr1 50 200
chr1 600 700
cat file3.bed
chr1 100 200
chr1 150 300
chr1 500 600
Now, merge the overlapping or book-ended genomic intervals from these three BED files:
cat file1.bed file2.bed file3.bed | bedtools sort | bedtools merge > merged.bed
cat merged.bed
chr1 10 300
chr1 400 700
In the above example, we used the cat command to concatenate the three files, bedtools sort to sort the combined intervals, and bedtools merge to merge the overlapping and book-ended intervals.
The merged output is saved in the merged.bed file.
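To make the sort-then-merge logic concrete, here is a minimal sketch that reproduces it with only sort and awk. It is an illustration of the idea, not how bedtools is implemented, and the function name merge_bed is hypothetical. It assumes tab-delimited, three-column BED input.

```shell
#!/bin/sh
# merge_bed FILE...: print merged intervals for tab-delimited BED3 input.
merge_bed() {
  # Sort by chromosome, then numerically by start position.
  LC_ALL=C sort -k1,1 -k2,2n "$@" |
  awk 'BEGIN { FS = OFS = "\t" }
    NR == 1 { chrom = $1; start = $2 + 0; end = $3 + 0; next }
    # Same chromosome and start <= current end: overlapping or book-ended.
    $1 == chrom && $2 + 0 <= end { if ($3 + 0 > end) end = $3 + 0; next }
    # Otherwise emit the finished region and begin a new one.
    { print chrom, start, end; chrom = $1; start = $2 + 0; end = $3 + 0 }
    END { if (NR > 0) print chrom, start, end }'
}

# Recreate the three example files in a scratch directory.
workdir=$(mktemp -d)
printf 'chr1\t10\t100\nchr1\t400\t500\n' > "$workdir/file1.bed"
printf 'chr1\t50\t200\nchr1\t600\t700\n' > "$workdir/file2.bed"
printf 'chr1\t100\t200\nchr1\t150\t300\nchr1\t500\t600\n' > "$workdir/file3.bed"

# Prints two tab-separated lines: chr1 10 300 and chr1 400 700
merge_bed "$workdir/file1.bed" "$workdir/file2.bed" "$workdir/file3.bed"
```

The comparison uses "start <= current end" rather than a strict "<", which is what makes book-ended intervals such as 400-500 and 500-600 join into one region.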
The bedtools merge command requires its input to be sorted by chromosome and then by start position. Please read this article on how to sort a BED file effectively.
Example 2: Using bedops
bedops can merge overlapping genomic intervals from multiple BED files into contiguous regions.
For example, suppose you have the following three BED files (file1.bed, file2.bed, and file3.bed):
cat file1.bed
chr1 10 100
chr1 400 500
cat file2.bed
chr1 50 200
chr1 600 700
cat file3.bed
chr1 100 200
chr1 150 300
chr1 500 600
Now, merge the overlapping intervals from these three BED files into contiguous regions:
bedops --merge file1.bed file2.bed file3.bed > merged.bed
cat merged.bed
chr1 10 300
chr1 400 700
In the above example, we used the bedops --merge command to merge the overlapping genomic intervals from the three BED files into contiguous regions.
The merged output is saved in the merged.bed file.
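Note that bedops expects each input to be sorted in the order produced by sort-bed, the sorting tool that ships with BEDOPS. The example files happen to be sorted already. For inputs whose order is unknown, one option is to pass them through sort-bed first, as in the sketch below. The file names unsorted1.bed and unsorted2.bed are hypothetical demo inputs, and the script assumes both BEDOPS tools are on PATH (it skips the merge otherwise).

```shell
#!/bin/sh
# Sketch: sort inputs with sort-bed, then merge the sorted stream with bedops.
workdir=$(mktemp -d)

# Deliberately unsorted, tab-delimited demo inputs.
printf 'chr1\t400\t500\nchr1\t10\t100\n' > "$workdir/unsorted1.bed"
printf 'chr1\t600\t700\nchr1\t50\t200\n' > "$workdir/unsorted2.bed"

if command -v sort-bed >/dev/null 2>&1 && command -v bedops >/dev/null 2>&1; then
  # sort-bed writes one sorted stream for all inputs;
  # "-" tells bedops to read that stream from standard input.
  sort-bed "$workdir/unsorted1.bed" "$workdir/unsorted2.bed" |
    bedops --merge - > "$workdir/merged.bed"
  cat "$workdir/merged.bed"
else
  echo "BEDOPS not found; skipping the demo" >&2
fi
```

Piping through sort-bed avoids writing intermediate sorted files, at the cost of re-sorting on every run.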