bed_intersect: R Equivalent to Bedtools Intersect
bedtools intersect is a widely used bioinformatics tool for finding overlaps between genomic intervals stored in BED files. It is part of the bedtools suite, which is designed to run on the command line in a UNIX-like environment and cannot be called directly as an R function.
However, you can use the bed_intersect function from the R valr package to perform the equivalent of a bedtools intersect analysis in R.
The basic syntax for the bed_intersect function is:
# install.packages("valr")
# load package
library("valr")
bed_intersect(x, y)
Where x and y are data frames (tibbles) of genomic intervals in BED format, each with chrom, start, and end columns. Intervals stored in BED files must be read into R first.
The following example shows how to find overlapping intervals in R using the bed_intersect function (similar to bedtools intersect).
Create two example sets of genomic intervals to find the overlap:
# first BED file
bed1 <- tibble::tribble(
~chrom, ~start, ~end,
"chr1", 100, 500,
"chr1", 600, 700
)
# second BED file
bed2 <- tibble::tribble(
~chrom, ~start, ~end,
"chr1", 300, 500,
"chr1", 800, 900
)
We have created two sets of genomic intervals in BED format using tibble::tribble. You can also import intervals from BED files, for example with the read_bed function from valr.
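Importing intervals from files could look like the following minimal sketch. The file names file1.bed and file2.bed and the temporary directory are assumptions made for illustration; the files are written on the fly so the example is self-contained, and they are assumed to be tab-separated with no header line.

```r
library(valr)

# Hypothetical file paths, created in a temporary directory for this sketch
bed1_path <- file.path(tempdir(), "file1.bed")
bed2_path <- file.path(tempdir(), "file2.bed")

# Write two small tab-separated, headerless 3-column BED files
writeLines(c("chr1\t100\t500", "chr1\t600\t700"), bed1_path)
writeLines(c("chr1\t300\t500", "chr1\t800\t900"), bed2_path)

# read_bed() reads each file into a tibble with chrom, start, and end columns
bed1_from_file <- read_bed(bed1_path)
bed2_from_file <- read_bed(bed2_path)

# The imported tibbles can be passed to bed_intersect() like hand-built ones
overlaps_from_files <- bed_intersect(bed1_from_file, bed2_from_file)
overlaps_from_files
```

With real data, replace the two writeLines calls with paths to your own BED files.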
Now, we will perform the bedtools intersect analysis using the bed_intersect function to find the overlapping intervals between the bed1 and bed2 genomic intervals.
# install.packages("valr")
# load package
library("valr")
bed_intersect(bed1, bed2)
# output
# A tibble: 1 × 6
  chrom start.x end.x start.y end.y .overlap
  <chr>   <dbl> <dbl>   <dbl> <dbl>    <int>
1 chr1      100   500     300   500      200
The output contains one row, so exactly one pair of intervals overlaps: chr1:100-500 from bed1 and chr1:300-500 from bed2. The .overlap column shows that they share 200 bp.
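The 200 bp figure can be checked by hand: the overlap of two intervals on the same chromosome is the smaller end minus the larger start. The sketch below uses base R only; overlap_bp is a hypothetical helper written for this illustration and is not part of valr.

```r
# Hypothetical helper (not a valr function): overlap length in bp of two
# intervals on the same chromosome, floored at 0 when they do not overlap
overlap_bp <- function(start_x, end_x, start_y, end_y) {
  max(0, min(end_x, end_y) - max(start_x, start_y))
}

# chr1:100-500 vs chr1:300-500 -> min(500, 500) - max(100, 300) = 200
first_pair_bp <- overlap_bp(100, 500, 300, 500)

# chr1:600-700 vs chr1:800-900 -> 700 - 800 is negative, so no overlap
second_pair_bp <- overlap_bp(600, 700, 800, 900)

print(c(first_pair_bp, second_pair_bp))
```

Note that this helper returns 0 both for disjoint intervals and for intervals that merely touch, so it only verifies the overlap length and does not decide which pairs bed_intersect reports.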
In addition, you can visualize the overlap between the two sets of genomic intervals using the bed_glyph function.
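# load package
library("valr")
# the valr documentation examples name the inputs x and y,
# so the example intervals are assigned to those names first
x <- bed1
y <- bed2
bed_glyph(bed_intersect(x, y))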
bed_intersect also accepts additional parameters, such as invert and suffix, which are described in the valr documentation.
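As one illustration of these parameters, the sketch below uses invert = TRUE, which asks for the intervals in the first input that do not overlap the second (comparable to bedtools intersect -v). It reuses the example intervals from above; for this data the result is expected to be the bed1 interval at 600-700.

```r
library(valr)

# Same example intervals as above, repeated so the block is self-contained
bed1 <- tibble::tribble(
  ~chrom, ~start, ~end,
  "chr1", 100, 500,
  "chr1", 600, 700
)
bed2 <- tibble::tribble(
  ~chrom, ~start, ~end,
  "chr1", 300, 500,
  "chr1", 800, 900
)

# invert = TRUE keeps the bed1 intervals with no overlap in bed2
non_overlapping <- bed_intersect(bed1, bed2, invert = TRUE)
non_overlapping
```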