TransWikia.com

What exactly is "chromosome topology"?

Biology Asked on April 22, 2021

I’ve been reading a lot about Hi-C lately, and this has been bothering me. As far as I can tell from reading around, the topology is related to the conformation of the linear chromosome. This seems like a fairly pointless description, however, as all linear chromosomes should have the same topology. A "conformation", in my understanding, is the chromosome being folded into a particular geometry.

On the other hand, I guess if two genes were brought together to form a loop, they would also form a "hole" (the gap in the loop). But this seems like a weak connection to make: just because two chromosome regions are paired doesn’t necessarily mean they were bound and formed a loop. It seems the "topology" would best be seen after a 3D reconstruction from the pairs data, showing the predicted geometry. From there you could see how many putative loops and similar features there are, and thus start to determine the topological properties.

The last possibility I was considering is that this refers to the network topology implied by a contact matrix. This seems the most likely to me, but also the least explicitly stated (to me, at least) in the literature I’ve read.

All of this has me wondering if I’ve missed something about the meaning of "topology" here, as opposed to just "conformation". I’m wondering if there is a consensus on the precise meaning of topology in the genomics/Hi-C context.

2 Answers

I think that there are a few things going on in this question; I'm going to try to answer the ones that I think are most pertinent:

  1. We should not expect "topology" as used in the Hi-C field to have a rigorous mathematical interpretation. Most of the people thinking about it are ultimately biologists who are interested in biological problems. As suggested in the comments, they are basically interested in the shape of DNA in the cell; e.g. which pieces of DNA are close together and which are far away. You can use this information for a lot of different purposes of course, but most people using the technique are satisfied with viewing topology as "the set of pairwise distances between all DNA chunks" (and whatever things you can infer from that set).

  2. Your sense (from the comment) that the linear chromosome ordering is the main interesting feature of topology is (IMO) absolutely correct. However, it is not a given that every chromosome is put together the same way. This is related to the most useful application of Hi-C data that I know of, which is to either (A) scaffold genomes (put discontinuous sequences into the right order) or (B) deconvolute mixed populations of genomes into groups of sequences that belong to the same cell. The linear order of DNA is by at least an order of magnitude the dominant signal in the data, and is therefore the most trustworthy thing in there.

  3. Network topology is sometimes implicit but generally central as a feature of algorithms using Hi-C data. I don't know that we can say that all workers in the field are thinking of this when they talk about topology, but I certainly am. Some random papers that make it fairly obvious: here, here, here.
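To make the "network topology implied by a contact matrix" idea concrete, here is a minimal sketch, not a method taken from any of the papers mentioned. It assumes a small symmetric matrix of contact counts between genomic bins and a hand-picked count threshold; the function name `contact_graph_summary` and the toy data are hypothetical. Contacts at or above the threshold become graph edges, and the cycle rank (edges minus nodes plus connected components) counts independent loops in the resulting graph, which is one loose way to read the "hole" in the question.

```python
def contact_graph_summary(contacts, threshold):
    """Threshold a symmetric contact matrix into an undirected graph and
    return (n_edges, n_components, cycle_rank)."""
    n = len(contacts)
    # Keep each bin pair (i < j) whose contact count meets the threshold.
    edges = [(i, j)
             for i in range(n)
             for j in range(i + 1, n)
             if contacts[i][j] >= threshold]

    # Union-find to count connected components of the thresholded graph.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)

    n_components = len({find(i) for i in range(n)})
    # Cycle rank: number of independent loops in the graph.
    cycle_rank = len(edges) - n + n_components
    return len(edges), n_components, cycle_rank


# Toy data (made up): six bins along one chromosome.
toy = [[0] * 6 for _ in range(6)]
for i in range(5):
    toy[i][i + 1] = toy[i + 1][i] = 10   # strong signal from linear adjacency
toy[0][5] = toy[5][0] = 8                # one long-range contact closing a loop
toy[1][4] = toy[4][1] = 1                # weak contact, below the threshold

summary = contact_graph_summary(toy, threshold=5)
```

With these toy numbers the linear backbone contributes five edges and the single long-range contact adds a sixth, so the graph has one independent loop. Real Hi-C data would need normalization and a principled threshold before a count like this means anything.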

(Full disclosure: I work for a company that sells Hi-C kits and services.)

Correct answer by Maximilian Press on April 22, 2021

I think I found a somewhat satisfying solution here, one that explains the original use of "topologically associated domain". I'm not sure if this was the original intent of the semantics, but it works well enough to make the use of "topology" seem more reasonable.

Topology is, according to Wiki, "concerned with the properties of a geometric object that are preserved under continuous deformations, such as stretching, twisting, crumpling and bending, but not tearing or gluing."

In population Hi-C at least, the data emphasizes the high-frequency contacts. As an example, imagine these as contacts that form a loop (there are other conformations, but this is the simplest case). The particular loop geometry can vary, but the contact that Hi-C measures won't. Think about a tied shoe: the shoe can move, and the loop of the knot can take on all sorts of geometries (imagine squishing it, folding the loop with your hands, etc.). Regardless, the knot stays the same; the "string loci" that are adjacent in the knot will always be adjacent. That adjacency is an observable indicating that the topology is probably staying the same, i.e. a "loop" (not to get into knot theory, just considering strings touching each other as creating a hole, i.e. the loop).

Thus, the contact matrices are almost a histogram of the observables that reflect topological features, at least for population Hi-C.

For single-cell Hi-C, because we don't have multiple samples, the contact matrix is not a histogram. Regardless, you can think of it as one term in a histogram, and thus still part of this indirectly topological description.
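The histogram picture can be sketched in a few lines. This is only an illustration under simplifying assumptions: each cell is represented as a list of contacting bin pairs, the data are made up, and `pool_contacts` is a hypothetical helper rather than part of any Hi-C toolkit. Pooling the cells gives a count per bin pair, and each single cell contributes one term to that count.

```python
from collections import Counter


def pool_contacts(cells):
    """Sum per-cell contact sets into a population-level count per bin pair."""
    counts = Counter()
    for cell in cells:
        # Sort each pair so (i, j) and (j, i) are the same contact; the set
        # ensures a contact is counted at most once per cell.
        counts.update({tuple(sorted(pair)) for pair in cell})
    return counts


# Toy data (made up): contacts observed in three single cells.
cells = [
    [(2, 7), (0, 3)],
    [(7, 2), (1, 5)],
    [(2, 7), (4, 6)],
]

pooled = pool_contacts(cells)
# Contacts seen in every cell: candidates for a stable feature such as a loop.
stable = [pair for pair, n in pooled.items() if n == len(cells)]
```

Here the contact between bins 2 and 7 is present in all three cells while the others appear once each, so only that contact stands out in the pooled counts. That mirrors the shoelace argument: the geometry in each cell may differ, but the persistent adjacency shows up as a tall bar in the histogram.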

Let me know if this is unclear.

Answered by Chris on April 22, 2021
