How to Merge Regions in Multiple BED Files
In bioinformatics, you often need to combine the intervals from two or more BED files and merge any overlapping or book-ended genomic intervals into contiguous regions.
You can use tools such as bedtools and bedops to merge two or more BED files.
Method 1: Using bedtools
If you have a few BED files:
cat file1.bed file2.bed | bedtools sort | bedtools merge > merged.bed
If you have many BED files:
cat *.bed | bedtools sort | bedtools merge > merged.bed
Method 2: Using bedops
bedops --merge file1.bed file2.bed file3.bed > merged.bed
The following practical examples demonstrate how to use bedtools and bedops to merge multiple BED files.
Example 1: Using bedtools
bedtools can be used to merge overlapping or book-ended genomic intervals from multiple BED files.
For example, suppose you have the following three BED files (file1.bed, file2.bed, and file3.bed):
cat file1.bed
chr1 10 100
chr1 400 500
cat file2.bed
chr1 50 200
chr1 600 700
cat file3.bed
chr1 100 200
chr1 150 300
chr1 500 600
Now, merge the overlapping or book-ended genomic intervals from these three BED files:
cat file1.bed file2.bed file3.bed | bedtools sort | bedtools merge > merged.bed
cat merged.bed
chr1 10 300
chr1 400 700
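In the above example, we used the cat command to concatenate the three files, piped the result through bedtools sort, and then passed the sorted intervals to bedtools merge to merge the overlapping and book-ended genomic intervals.
The merged output is saved in the merged.bed file.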
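To make the sort-then-merge logic concrete, the following is a minimal sketch that reproduces the same result using only the standard sort and awk commands, without bedtools. It assumes whitespace-delimited input with at least three columns, and it is meant as an illustration of the idea rather than a replacement for bedtools merge. The output name merged_awk.bed is chosen here only for illustration, and the sample files are recreated with printf so that the snippet is self-contained.

```shell
# Recreate the three example files (tab-delimited).
printf 'chr1\t10\t100\nchr1\t400\t500\n' > file1.bed
printf 'chr1\t50\t200\nchr1\t600\t700\n' > file2.bed
printf 'chr1\t100\t200\nchr1\t150\t300\nchr1\t500\t600\n' > file3.bed

# Sort by chromosome, then numerically by start, and merge in one awk pass.
# An interval is folded into the current region when it is on the same
# chromosome and starts at or before the region's end, so book-ended
# intervals (start equal to the previous end) are merged as well.
cat file1.bed file2.bed file3.bed |
  LC_ALL=C sort -k1,1 -k2,2n |
  awk 'BEGIN { OFS = "\t" }
       NR == 1 { chrom = $1; start = $2 + 0; end = $3 + 0; next }
       $1 == chrom && $2 + 0 <= end { if ($3 + 0 > end) end = $3 + 0; next }
       { print chrom, start, end; chrom = $1; start = $2 + 0; end = $3 + 0 }
       END { if (NR > 0) print chrom, start, end }' > merged_awk.bed

cat merged_awk.bed
```

For the three example files, this prints the same two regions as merged.bed (chr1 10 300 and chr1 400 700). For real data, bedtools merge is still the better choice because it also handles options such as a merge distance and column summaries.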
The bedtools merge command requires a BED file sorted by chromosome and start position. Please read this article on how to sort a BED file effectively.
Example 2: Using bedops
bedops can be used to merge overlapping genomic intervals from multiple BED files into contiguous regions.
For example, suppose you have the following three BED files (file1.bed, file2.bed, and file3.bed):
cat file1.bed
chr1 10 100
chr1 400 500
cat file2.bed
chr1 50 200
chr1 600 700
cat file3.bed
chr1 100 200
chr1 150 300
chr1 500 600
Now, merge the overlapping intervals from these three BED files into contiguous regions:
bedops --merge file1.bed file2.bed file3.bed > merged.bed
cat merged.bed
chr1 10 300
chr1 400 700
In the above example, we used the bedops --merge command to merge the overlapping genomic intervals from the three BED files into contiguous regions.
The merged output is saved in the merged.bed file.
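Both approaches depend on sorted input. bedtools merge needs intervals sorted by chromosome and start position, and bedops likewise expects each input file to be sorted (the BEDOPS suite provides the sort-bed command for this). The example files above already happen to be sorted. As a quick sanity check before merging, the sketch below defines a hypothetical helper, is_sorted_bed, which is not part of either tool. It assumes whitespace-delimited BED input and checks only that chromosome names are in byte order and that start positions do not decrease within a chromosome. It does not check end coordinates, and it is stricter than bedtools merge strictly needs.

```shell
# is_sorted_bed FILE
# Hypothetical helper (not part of bedtools or BEDOPS).
# Exits with status 0 if FILE is ordered by chromosome name (byte order)
# and then by numeric start; otherwise reports the first out-of-order
# line on stderr and exits with status 1.
is_sorted_bed() {
  LC_ALL=C awk '
    NR > 1 {
      c = $1 ""    # appending "" forces string comparison of names
      if (c < prev_chrom || (c == prev_chrom && $2 + 0 < prev_start)) {
        printf "%s: line %d is out of order\n", FILENAME, NR > "/dev/stderr"
        exit 1
      }
    }
    { prev_chrom = $1 ""; prev_start = $2 + 0 }
  ' "$1"
}

# Usage with two small illustrative files.
printf 'chr1\t10\t100\nchr1\t400\t500\n' > sorted_example.bed
printf 'chr1\t400\t500\nchr1\t10\t100\n' > unsorted_example.bed

is_sorted_bed sorted_example.bed && echo "sorted_example.bed: sorted"
is_sorted_bed unsorted_example.bed || echo "unsorted_example.bed: needs sorting"
```

If a file fails the check, sort it first (for example with bedtools sort or sort-bed) and then run the merge on the sorted copy.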