From this table we can see that most items have some correlation with each other, ranging from \(r=-0.382\) for Items 3 and 7 to \(r=0.514\) for Items 6 and 7. Due to the relatively high correlations among items, this dataset is a good candidate for factor analysis. Recall that the goal of factor analysis is to model the interrelationships between items with fewer (latent) variables. These interrelationships can be broken up into multiple components.
Since the goal of factor analysis is to model the interrelationships among items, we focus primarily on the variance and covariance rather than the mean. Factor analysis assumes that variance can be partitioned into two types: common variance and unique variance.
The figure below shows how these concepts are related:
As a data analyst, the goal of a factor analysis is to reduce the number of variables needed to explain and interpret the results. This can be accomplished in two steps:
Factor extraction involves making a choice about the type of model as well the number of factors to extract. Factor rotation comes after the factors are extracted, with the goal of achieving simple structure in order to improve interpretability.
There are two approaches to factor extraction, which stem from different approaches to variance partitioning: a) principal components analysis and b) common factor analysis.
Unlike factor analysis, principal components analysis (PCA) assumes that there is no unique variance: the total variance is equal to the common variance. Recall that variance can be partitioned into common and unique variance. If there is no unique variance, then common variance takes up the total variance (see figure below). Additionally, if the total variance is 1, then the common variance is equal to the communality.
The goal of a PCA is to replicate the correlation matrix using a set of components that are fewer in number than, and linear combinations of, the original set of items. Although the following analysis defeats the purpose of doing a PCA, we will begin by extracting as many components as possible as a teaching exercise, so that we can later decide on the optimal number of components to extract.
First, go to Analyze – Dimension Reduction – Factor. Move all the observed variables over to the Variables: box to be analyzed.
Under Extraction – Method, pick Principal components and make sure to Analyze the Correlation matrix. We also request the Unrotated factor solution and the Scree plot. Under Extract, choose Fixed number of factors, and under Factors to extract enter 8. We also bumped up the Maximum Iterations for Convergence to 100.
The equivalent SPSS syntax is shown below:
Before we get into the SPSS output, let’s understand a few things about eigenvalues and eigenvectors.
Eigenvalues represent the total amount of variance that can be explained by a given principal component. In theory eigenvalues can be positive or negative, but in practice they represent explained variance, which is always positive.
Eigenvalues are also the sum of squared component loadings across all items for each component, which represent the amount of variance in each item that can be explained by the principal component.
Eigenvectors represent a weight for each item on each component. An eigenvector element times the square root of the corresponding eigenvalue gives the component loading, which can be interpreted as the correlation of that item with the principal component. For this particular PCA of the SAQ-8, the eigenvector element associated with Item 1 on the first component is \(0.377\), and the eigenvalue of the first component is \(3.057\). We can calculate the loading of Item 1 on the first component as
$$(0.377)\sqrt{3.057}= 0.659.$$
In this case, we can say that the correlation of the first item with the first component is \(0.659\). Let’s now move on to the component matrix.
The elements of the Component Matrix are loadings, which can be interpreted as the correlation of each item with the component. Each item has a loading corresponding to each of the 8 components. For example, Item 1 is correlated \(0.659\) with the first component, \(0.136\) with the second component, \(-0.398\) with the third, and so on.
The square of each loading represents the proportion of variance (think of it as an \(R^2\) statistic) explained by a particular component. For Item 1, \((0.659)^2=0.434\), or \(43.4\%\), of its variance is explained by the first component. Subsequently, \((0.136)^2 = 0.018\), or \(1.8\%\), of the variance in Item 1 is explained by the second component. The total variance explained by both components is thus \(43.4\%+1.8\%=45.2\%\). If you keep adding the squared loadings cumulatively across the components, you will find that the sum is 1, or 100%. This sum is known as the communality, and in a PCA the communality for each item is equal to the item's total variance.
**Component Matrix** (loadings of each item on each component)

| Item | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| 1 | 0.659 | 0.136 | -0.398 | 0.160 | -0.064 | 0.568 | -0.177 | 0.068 |
| 2 | -0.300 | 0.866 | -0.025 | 0.092 | -0.290 | -0.170 | -0.193 | -0.001 |
| 3 | -0.653 | 0.409 | 0.081 | 0.064 | 0.410 | 0.254 | 0.378 | 0.142 |
| 4 | 0.720 | 0.119 | -0.192 | 0.064 | -0.288 | -0.089 | 0.563 | -0.137 |
| 5 | 0.650 | 0.096 | -0.215 | 0.460 | 0.443 | -0.326 | -0.092 | -0.010 |
| 6 | 0.572 | 0.185 | 0.675 | 0.031 | 0.107 | 0.176 | -0.058 | -0.369 |
| 7 | 0.718 | 0.044 | 0.453 | -0.006 | -0.090 | -0.051 | 0.025 | 0.516 |
| 8 | 0.568 | 0.267 | -0.221 | -0.694 | 0.258 | -0.084 | -0.043 | -0.012 |

Extraction Method: Principal Component Analysis.
a. 8 components extracted.
Summing the squared component loadings across the components (columns) gives you the communality estimates for each item, and summing each squared loading down the items (rows) gives you the eigenvalue for each component. For example, to obtain the first eigenvalue we calculate:
$$(0.659)^2 + (-0.300)^2 + (-0.653)^2 + (0.720)^2 + (0.650)^2 + (0.572)^2 + (0.718)^2 + (0.568)^2 = 3.057$$
You will get eight eigenvalues for eight components, which leads us to the next table.
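These relationships are easy to check numerically. Here is a minimal Python sketch using the first-component loadings from the Component Matrix above and the eigenvector element for Item 1 quoted earlier; results match SPSS up to rounding of the printed values.

```python
import math

# First-component loadings for the eight SAQ-8 items (Component Matrix, column 1).
loadings = [0.659, -0.300, -0.653, 0.720, 0.650, 0.572, 0.718, 0.568]

# Eigenvalue = sum of squared loadings down the items for one component.
eigenvalue = sum(x ** 2 for x in loadings)
print(round(eigenvalue, 3))  # 3.057

# Going the other way: eigenvector element times sqrt(eigenvalue) recovers the loading.
eigenvector_item1 = 0.377  # eigenvector element for Item 1 on Component 1
print(round(eigenvector_item1 * math.sqrt(eigenvalue), 3))  # 0.659
```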
Total Variance Explained in the 8-component PCA
Recall that the eigenvalue represents the total amount of variance that can be explained by a given principal component. Starting from the first component, each subsequent component is obtained from partialling out the previous component. Therefore the first component explains the most variance, and the last component explains the least. The Total Variance Explained table gives the total variance explained by each component. For example, the eigenvalue of Component 1 is \(3.057\), so it explains \(3.057/8 = 38.21\%\) of the total variance. Because we extracted the same number of components as the number of items, the Initial Eigenvalues column is the same as the Extraction Sums of Squared Loadings column.
**Total Variance Explained**

| Component | Initial Total | Initial % of Variance | Initial Cumulative % | Extraction Total | Extraction % of Variance | Extraction Cumulative % |
|---|---|---|---|---|---|---|
| 1 | 3.057 | 38.206 | 38.206 | 3.057 | 38.206 | 38.206 |
| 2 | 1.067 | 13.336 | 51.543 | 1.067 | 13.336 | 51.543 |
| 3 | 0.958 | 11.980 | 63.523 | 0.958 | 11.980 | 63.523 |
| 4 | 0.736 | 9.205 | 72.728 | 0.736 | 9.205 | 72.728 |
| 5 | 0.622 | 7.770 | 80.498 | 0.622 | 7.770 | 80.498 |
| 6 | 0.571 | 7.135 | 87.632 | 0.571 | 7.135 | 87.632 |
| 7 | 0.543 | 6.788 | 94.420 | 0.543 | 6.788 | 94.420 |
| 8 | 0.446 | 5.580 | 100.000 | 0.446 | 5.580 | 100.000 |

Extraction Method: Principal Component Analysis. "Initial" columns are the Initial Eigenvalues; "Extraction" columns are the Extraction Sums of Squared Loadings.
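The % of Variance column is just each eigenvalue divided by the total variance, which equals the number of items. A quick sketch using the printed (rounded) eigenvalues:

```python
# Eigenvalues from the Total Variance Explained table (rounded as printed).
eigenvalues = [3.057, 1.067, 0.958, 0.736, 0.622, 0.571, 0.543, 0.446]

total = sum(eigenvalues)            # total variance = number of items
pct = [100 * e / total for e in eigenvalues]

print(round(total, 3))              # 8.0
print(round(pct[0], 2))             # 38.21
print(round(pct[0] + pct[1], 2))    # cumulative % for two components; 51.55 from
                                    # rounded inputs (SPSS reports 51.543)
```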
Since the goal of running a PCA is to reduce our set of variables down, it would be useful to have a criterion for selecting the optimal number of components, which is of course smaller than the total number of items. One criterion is to choose components that have eigenvalues greater than 1. In the Total Variance Explained table, we see that the first two components have eigenvalues greater than 1. This can be confirmed by the Scree Plot, which plots the eigenvalue (total variance explained) against the component number. Recall that we checked the Scree Plot option under Extraction – Display, so the scree plot should be produced automatically.
The first component will always have the highest total variance and the last component will always have the least, but where do we see the largest drop? If you look at Component 2, you will see an “elbow” joint. This is the point at which it is perhaps not too beneficial to continue further component extraction. There are some conflicting definitions of how to interpret the scree plot, but some say to take the number of components to the left of the “elbow”. Following this criterion we would pick only one component. A more subjective interpretation of the scree plot suggests that any number of components between 1 and 4 would be plausible, and further corroborative evidence would be helpful.
Some criteria say that the total variance explained by all components should be between 70% and 80%, which in this case would mean about four to five components. The authors of the book say that this may be untenable for social science research, where extracted factors usually explain only 50% to 60% of the variance. Picking the number of components is a bit of an art and requires input from the whole research team. Let’s suppose we talked to the principal investigator and she believes that the two-component solution makes sense for the study, so we will proceed with the analysis.
Running the two-component PCA is just as easy as running the eight-component solution. The only difference is that under Fixed number of factors – Factors to extract you enter 2.
We will focus on the differences in the output between the eight- and two-component solutions. Under Total Variance Explained, we see that the Initial Eigenvalues no longer equal the Extraction Sums of Squared Loadings. The main difference is that there are now only two rows of eigenvalues under Extraction, and the cumulative percent variance goes up to only \(51.54\%\).
**Total Variance Explained**

| Component | Initial Total | Initial % of Variance | Initial Cumulative % | Extraction Total | Extraction % of Variance | Extraction Cumulative % |
|---|---|---|---|---|---|---|
| 1 | 3.057 | 38.206 | 38.206 | 3.057 | 38.206 | 38.206 |
| 2 | 1.067 | 13.336 | 51.543 | 1.067 | 13.336 | 51.543 |
| 3 | 0.958 | 11.980 | 63.523 | | | |
| 4 | 0.736 | 9.205 | 72.728 | | | |
| 5 | 0.622 | 7.770 | 80.498 | | | |
| 6 | 0.571 | 7.135 | 87.632 | | | |
| 7 | 0.543 | 6.788 | 94.420 | | | |
| 8 | 0.446 | 5.580 | 100.000 | | | |

Extraction Method: Principal Component Analysis. "Initial" columns are the Initial Eigenvalues; "Extraction" columns are the Extraction Sums of Squared Loadings.
Similarly, you will see that the Component Matrix has the same loadings as the eight-component solution but instead of eight columns it’s now two columns.
**Component Matrix**

| Item | Component 1 | Component 2 |
|---|---|---|
| 1 | 0.659 | 0.136 |
| 2 | -0.300 | 0.866 |
| 3 | -0.653 | 0.409 |
| 4 | 0.720 | 0.119 |
| 5 | 0.650 | 0.096 |
| 6 | 0.572 | 0.185 |
| 7 | 0.718 | 0.044 |
| 8 | 0.568 | 0.267 |

Extraction Method: Principal Component Analysis.
a. 2 components extracted.
Again, we interpret Item 1 as having a correlation of 0.659 with Component 1. From glancing at the solution, we see that Item 4 has the highest correlation with Component 1 and Item 2 the lowest. Similarly, we see that Item 2 has the highest correlation with Component 2 and Item 7 the lowest.
True or False
1. T, 2. F (sum of squared loadings), 3. T
The communality is the sum of the squared component loadings up to the number of components you extract. In the SPSS output you will see a table of communalities.
**Communalities**

| Item | Initial | Extraction |
|---|---|---|
| 1 | 1.000 | 0.453 |
| 2 | 1.000 | 0.840 |
| 3 | 1.000 | 0.594 |
| 4 | 1.000 | 0.532 |
| 5 | 1.000 | 0.431 |
| 6 | 1.000 | 0.361 |
| 7 | 1.000 | 0.517 |
| 8 | 1.000 | 0.394 |

Extraction Method: Principal Component Analysis.
Since PCA is an iterative estimation process, it starts with 1 as an initial estimate of the communality (since this is the total variance across all 8 components), and then proceeds with the analysis until a final communality is extracted. Notice that the Extraction column is smaller than the Initial column because we only extracted two components. As an exercise, let’s manually calculate the first communality from the Component Matrix. The first ordered pair is \((0.659,0.136)\), which represents the correlation of the first item with Component 1 and Component 2. Recall that squaring the loadings and summing across the components (columns) gives us the communality:
$$h^2_1 = (0.659)^2 + (0.136)^2 = 0.453$$
Going back to the Communalities table, if you sum down all 8 items (rows) of the Extraction column, you get \(4.123\). If you go back to the Total Variance Explained table and sum the first two eigenvalues, you also get \(3.057+1.067=4.124\) (the tiny discrepancy is rounding error). Is that surprising? Basically, it says that summing the communalities across all items is the same as summing the eigenvalues across all components.
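A one-line check of this equality, using the Extraction communalities and retained eigenvalues from the two tables above (the third-decimal discrepancy comes from rounding in the printed values):

```python
# Extraction communalities (Communalities table) and retained eigenvalues
# (Total Variance Explained table) from the two-component PCA.
extraction_communalities = [0.453, 0.840, 0.594, 0.532, 0.431, 0.361, 0.517, 0.394]
retained_eigenvalues = [3.057, 1.067]

# Total common variance two ways: summed over items, or summed over components.
print(round(sum(extraction_communalities), 3))  # 4.122
print(round(sum(retained_eigenvalues), 3))      # 4.124
```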
1. In a PCA, when would the communality for the Initial column be equal to the Extraction column?
Answer: When you run an 8-component PCA.
1. F, the eigenvalue is the total communality across all items for a single component, 2. T, 3. T, 4. F (you can only sum communalities across items, and sum eigenvalues across components, but if you do that they are equal).
The partitioning of variance differentiates a principal components analysis from what we call common factor analysis. Both methods try to reduce the dimensionality of the dataset down to fewer unobserved variables, but whereas PCA assumes that common variance takes up all of the total variance, common factor analysis assumes that total variance can be partitioned into common and unique variance. It is usually more reasonable to assume that you have not measured your set of items perfectly. The unobserved or latent variable that makes up common variance is called a factor, hence the name factor analysis. The other main difference between PCA and factor analysis lies in the goal of your analysis. If your goal is simply to reduce your variable list down to a linear combination of smaller components, then PCA is the way to go. However, if you believe there is some latent construct that defines the interrelationship among items, then factor analysis may be more appropriate. In this case, we assume that there is a construct called SPSS Anxiety that explains why you see a correlation among all the items on the SAQ-8. We acknowledge, however, that SPSS Anxiety cannot explain all the shared variance among items in the SAQ, so we model the unique variance as well. Based on the results of the PCA, we will start with a two-factor extraction.
To run a factor analysis, use the same steps as running a PCA (Analyze – Dimension Reduction – Factor) except under Method choose Principal axis factoring. Note that we continue to set Maximum Iterations for Convergence at 100 and we will see why later.
Pasting the syntax into the SPSS Syntax Editor we get:
Note the main difference is that under /EXTRACTION we list PAF for Principal Axis Factoring instead of PC for Principal Components. We will get three tables of output: Communalities, Total Variance Explained and Factor Matrix. Let’s go over each of these and compare them to the PCA output.
**Communalities**

| Item | Initial | Extraction |
|---|---|---|
| 1 | 0.293 | 0.437 |
| 2 | 0.106 | 0.052 |
| 3 | 0.298 | 0.319 |
| 4 | 0.344 | 0.460 |
| 5 | 0.263 | 0.344 |
| 6 | 0.277 | 0.309 |
| 7 | 0.393 | 0.851 |
| 8 | 0.192 | 0.236 |

Extraction Method: Principal Axis Factoring.
The most striking difference between this Communalities table and the one from the PCA is that the initial communality estimates are no longer 1. Recall that for a PCA, we assume the total variance is completely taken up by the common variance or communality, and therefore we pick 1 as our best initial guess. Instead of guessing 1 as the initial communality, principal axis factoring chooses the squared multiple correlation coefficient \(R^2\) of each item regressed on all the other items. To see this in action for Item 1, run a linear regression where Item 1 is the dependent variable and Items 2–8 are independent variables. Go to Analyze – Regression – Linear and enter q01 under Dependent and q02 to q08 under Independent(s).
Pasting the syntax into the Syntax Editor gives us:
The output we obtain from this analysis is
**Model Summary**

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | 0.541 | 0.293 | 0.291 | 0.697 |
Note that the R Square value of 0.293 matches the initial communality estimate for Item 1. We could run seven more linear regressions to get all eight communality estimates, but SPSS already does that for us. Like PCA, factor analysis uses an iterative estimation process to obtain the final estimates under the Extraction column. Finally, summing the Extraction column across all eight items gives 3.00. This represents the total common variance shared among all items for a two-factor solution.
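The squared multiple correlation can also be computed directly from the correlation matrix: the SMC of each variable on the rest equals \(1 - 1/r^{ii}\), where \(r^{ii}\) is the corresponding diagonal element of the inverse correlation matrix. A minimal sketch in Python, using a small hypothetical correlation matrix (not the SAQ-8 data):

```python
import numpy as np

# Hypothetical 3-variable correlation matrix, used only to illustrate the identity
# SMC_i = 1 - 1 / [R^-1]_{ii}; these are NOT the SAQ-8 correlations.
R = np.array([[1.0, 0.5, 0.3],
              [0.5, 1.0, 0.4],
              [0.3, 0.4, 1.0]])

# Initial communalities a PAF would use: squared multiple correlation of each
# variable regressed on all the others.
smc = 1 - 1 / np.diag(np.linalg.inv(R))
print(np.round(smc, 3))
```

Each element of `smc` is the \(R^2\) you would get from regressing that variable on the remaining ones, which is exactly what appears in the Initial column of the PAF Communalities table.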
The next table we will look at is Total Variance Explained. Comparing this to the table from the PCA, we notice that the Initial Eigenvalues are exactly the same and include 8 rows, one for each “factor”. In fact, SPSS simply borrows this information from the PCA analysis for use in the factor analysis, so the factors in the Initial Eigenvalues column are actually components. The main difference now is in the Extraction Sums of Squared Loadings. We notice that each corresponding row in the Extraction column is lower than in the Initial column. This is expected, because we assume that total variance can be partitioned into common and unique variance, which means the common variance explained will be lower. Factor 1 explains 31.38% of the variance and Factor 2 explains 6.24%. Just as in PCA, the more factors you extract, the less variance is explained by each successive factor.
**Total Variance Explained**

| Factor | Initial Total | Initial % of Variance | Initial Cumulative % | Extraction Total | Extraction % of Variance | Extraction Cumulative % |
|---|---|---|---|---|---|---|
| 1 | 3.057 | 38.206 | 38.206 | 2.511 | 31.382 | 31.382 |
| 2 | 1.067 | 13.336 | 51.543 | 0.499 | 6.238 | 37.621 |
| 3 | 0.958 | 11.980 | 63.523 | | | |
| 4 | 0.736 | 9.205 | 72.728 | | | |
| 5 | 0.622 | 7.770 | 80.498 | | | |
| 6 | 0.571 | 7.135 | 87.632 | | | |
| 7 | 0.543 | 6.788 | 94.420 | | | |
| 8 | 0.446 | 5.580 | 100.000 | | | |

Extraction Method: Principal Axis Factoring. "Initial" columns are the Initial Eigenvalues; "Extraction" columns are the Extraction Sums of Squared Loadings.
A subtle note that may be easily overlooked is that when SPSS plots the scree plot or applies the Eigenvalues greater than 1 criterion (Analyze – Dimension Reduction – Factor – Extraction), it bases them on the Initial solution and not the Extraction solution. This is important because the criterion assumes no unique variance, as in PCA, which means that this is the total variance explained, not accounting for specific or measurement error. Note that in the Extraction Sums of Squared Loadings column the second factor has an eigenvalue that is less than 1, but it is still retained because the Initial value is 1.067. If you want to apply this criterion to the common variance explained, you would need to apply it yourself.
Answers: 1. When there is no unique variance (PCA assumes this whereas common factor analysis does not, so this is in theory and not in practice), 2. F, it uses the initial PCA solution and the eigenvalues assume no unique variance.
**Factor Matrix**

| Item | Factor 1 | Factor 2 |
|---|---|---|
| 1 | 0.588 | -0.303 |
| 2 | -0.227 | 0.020 |
| 3 | -0.557 | 0.094 |
| 4 | 0.652 | -0.189 |
| 5 | 0.560 | -0.174 |
| 6 | 0.498 | 0.247 |
| 7 | 0.771 | 0.506 |
| 8 | 0.470 | -0.124 |

Extraction Method: Principal Axis Factoring.
a. 2 factors extracted. 79 iterations required.
First note the annotation that 79 iterations were required. If we had simply used the default 25 iterations in SPSS, we would not have obtained an optimal solution. This is why in practice it’s always good to increase the maximum number of iterations. Now let’s get into the table itself. The elements of the Factor Matrix table are called loadings and represent the correlation of each item with the corresponding factor. Just as in PCA, squaring each loading and summing down the items (rows) gives the total variance explained by each factor. Note that they are no longer called eigenvalues as in PCA. Let’s calculate this for Factor 1:
$$(0.588)^2 + (-0.227)^2 + (-0.557)^2 + (0.652)^2 + (0.560)^2 + (0.498)^2 + (0.771)^2 + (0.470)^2 = 2.51$$
This number matches the first row under the Extraction column of the Total Variance Explained table. We can repeat this for Factor 2 and get matching results for the second row. Additionally, we can get the communality estimates by summing the squared loadings across the factors (columns) for each item. For example, for Item 1:
$$(0.588)^2 + (-0.303)^2 = 0.437$$
Note that these results match the value of the Communalities table for Item 1 under the Extraction column. This means that the sum of squared loadings across factors represents the communality estimates for each item.
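Both identities can be checked in a few lines of Python using the Factor Matrix loadings above; the results agree with the SPSS tables up to rounding of the printed loadings.

```python
# Two-factor PAF loadings for the eight items (Factor Matrix, Factors 1 and 2).
f1 = [0.588, -0.227, -0.557, 0.652, 0.560, 0.498, 0.771, 0.470]
f2 = [-0.303, 0.020, 0.094, -0.189, -0.174, 0.247, 0.506, -0.124]

# Summing squared loadings down the items gives each factor's Sums of Squared Loadings.
ssl = [sum(x ** 2 for x in col) for col in (f1, f2)]
print([round(s, 3) for s in ssl])  # close to the [2.511, 0.499] in the output

# Summing squared loadings across the factors gives each item's communality.
h2 = [a ** 2 + b ** 2 for a, b in zip(f1, f2)]
print(round(h2[0], 3))  # close to the 0.437 reported for Item 1
```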
To see the relationships among the three tables, let’s first start from the Factor Matrix (or Component Matrix in PCA); we will use the term factor to represent components in PCA as well. Its elements represent the correlation of each item with each factor. Square each element to obtain squared loadings, the proportion of variance in an item explained by each factor. Summing the squared loadings across factors gives the proportion of variance in that item explained by all factors in the model. This is known as common variance or communality, hence the result is the Communalities table.

Going back to the Factor Matrix, if you square the loadings and sum down the items you get the Sums of Squared Loadings (in PAF) or eigenvalues (in PCA) for each factor. These become elements of the Total Variance Explained table.

Summing down the factors under the Extraction column, we get \(2.511 + 0.499 = 3.01\), the total (common) variance explained by the two-factor solution for all eight items. Equivalently, since the Communalities table represents the total common variance explained by both factors for each item, summing down the items in the Communalities table also gives you the total (common) variance explained, in this case
$$0.437 + 0.052 + 0.319 + 0.460 + 0.344 + 0.309 + 0.851 + 0.236 = 3.01$$
which is the same result we obtained from the Total Variance Explained table. Here is a table that may help clarify what we’ve talked about:
In summary:
True or False (the following assumes a two-factor Principal Axis Factor solution with 8 items)
Answers: 1. T, 2. F, the sum of the squared elements across both factors, 3. T, 4. T, 5. F, sum all eigenvalues from the Extraction column of the Total Variance Explained table, 6. F, the total Sums of Squared Loadings represents only the total common variance excluding unique variance, 7. F, eigenvalues are only applicable for PCA.
Since this is a non-technical introduction to factor analysis, we won’t go into detail about the differences between Principal Axis Factoring (PAF) and Maximum Likelihood (ML). The main concept to know is that ML also assumes a common factor analysis using the \(R^2\) to obtain initial estimates of the communalities, but uses a different iterative process to obtain the extraction solution. To run a factor analysis using maximum likelihood estimation under Analyze – Dimension Reduction – Factor – Extraction – Method choose Maximum Likelihood.
Although the initial communalities are the same between PAF and ML, the final extraction loadings will be different, which means you will have different Communalities, Total Variance Explained, and Factor Matrix tables (although the Initial columns will overlap). The other main difference is that you will obtain a Goodness-of-fit Test table, which gives you an absolute test of model fit. Non-significant values suggest a good-fitting model. Here the p-value is less than 0.05, so we reject the two-factor model.
**Goodness-of-fit Test**

| Chi-Square | df | Sig. |
|---|---|---|
| 198.617 | 13 | 0.000 |
In practice, you would obtain chi-square values for multiple factor analysis runs, which we tabulate below for 1 to 8 factors. The table shows the number of factors extracted (or attempted to extract) as well as the chi-square, degrees of freedom, p-value and iterations needed to converge. Note that as you increase the number of factors, the chi-square value and degrees of freedom decrease but the iterations needed and the p-value increase. Practically, you want to make sure the number of iterations you specify exceeds the iterations needed. Additionally, NS means no solution and N/A means not applicable. In SPSS, no solution is obtained when you run 5 to 7 factors because the degrees of freedom become negative (which cannot happen). The eight-factor solution is not even applicable in SPSS because it will spew out a warning that “You cannot request as many factors as variables with any extraction method except PC. The number of factors will be reduced by one.” This means that if you try to extract an eight-factor solution for the SAQ-8, it will default back to the 7-factor solution.

Now that we understand the table, let’s see if we can find the threshold at which the absolute fit indicates a good-fitting model. It looks like the p-value becomes non-significant at a 3-factor solution. Note that this differs from the eigenvalues-greater-than-1 criterion, which chose 2 factors, and from the percent of variance explained criterion, by which you would choose 4–5 factors. We talk to the Principal Investigator, and at this point we still prefer the two-factor solution. Note that there is no “right” answer in picking the best factor model, only what makes sense for your theory. We will talk about interpreting the factor loadings when we talk about factor rotation, to further guide us in choosing the correct number of factors.
| Number of Factors | Chi-square | Df | p-value | Iterations needed |
|---|---|---|---|---|
| 1 | 553.08 | 20 | <0.05 | 4 |
| 2 | 198.62 | 13 | <0.05 | 39 |
| 3 | 13.81 | 7 | 0.055 | 57 |
| 4 | 1.386 | 2 | 0.5 | 168 |
| 5 | NS | -2 | NS | NS |
| 6 | NS | -5 | NS | NS |
| 7 | NS | -7 | NS | NS |
| 8 | N/A | N/A | N/A | N/A |
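The Df column, including the negative values that make 5 to 7 factors inadmissible, follows the standard degrees-of-freedom formula for a maximum likelihood factor model with \(p\) items and \(m\) factors, \(df = [(p-m)^2 - (p+m)]/2\):

```python
def ml_factor_df(p, m):
    """Degrees of freedom of the ML goodness-of-fit test: ((p - m)^2 - (p + m)) / 2."""
    return ((p - m) ** 2 - (p + m)) // 2

# Reproduce the Df column for the 8-item SAQ-8 with 1 through 7 factors.
print([ml_factor_df(8, m) for m in range(1, 8)])  # [20, 13, 7, 2, -2, -5, -7]
```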
Answers: 1. T, 2. F, the two use the same starting communalities but a different estimation process to obtain extraction loadings, 3. F, only Maximum Likelihood gives you chi-square values, 4. F, you can extract as many components as items in PCA, but SPSS will only extract up to the total number of items minus 1, 5. F, greater than 0.05, 6. T, we are taking away degrees of freedom but extracting more factors.
As we mentioned before, the main difference between common factor analysis and principal components is that factor analysis assumes total variance can be partitioned into common and unique variance, whereas principal components assumes that common variance takes up all of the total variance (i.e., there is no unique variance). For both methods, when the total variance of an item is 1, the common variance becomes the communality. The communality is unique to each item, so if you have 8 items you will obtain 8 communalities, and each represents the common variance explained by the factors or components. In the case of principal components, the communality is the total variance of each item, and summing all 8 communalities gives you the total variance across all items. In contrast, common factor analysis assumes that the communality is only a portion of the total variance, so that summing up the communalities represents the total common variance, not the total variance. In summary, for PCA, total common variance is equal to total variance explained, which in turn is equal to the total variance; in common factor analysis, total common variance is equal to total variance explained, but does not equal total variance.
The following applies to the SAQ-8 when theoretically extracting 8 components or factors for 8 items:
Answers: 1. T, 2. F, the total variance for each item, 3. T, 4. F, communality is unique to each item (shared across components or factors), 5. T, 6. T.
After deciding on the number of factors to extract and which analysis model to use, the next step is to interpret the factor loadings. Factor rotations help us interpret factor loadings. There are two general types of rotations: orthogonal and oblique.
The goal of factor rotation is to improve the interpretability of the factor solution by reaching simple structure.
Without rotation, the first factor is the most general factor onto which most items load and explains the largest amount of variance. This may not be desired in all cases. Suppose you wanted to know how well a set of items load on each factor; simple structure helps us to achieve this.
The definition of simple structure is that in a factor loading matrix:

1. each item (row) contains at least one zero loading;
2. each factor (column) contains at least as many zeros as there are factors;

and, for every pair of factors (columns),

3. several items have a zero loading on one factor but a non-zero loading on the other;
4. a large proportion of items have zero loadings on both factors; and
5. only a small number of items have non-zero loadings on both factors.
The following table is an example of simple structure with three factors:
Item | Factor 1 | Factor 2 | Factor 3 |
1 | 0.8 | 0 | 0 |
2 | 0.8 | 0 | 0 |
3 | 0.8 | 0 | 0 |
4 | 0 | 0.8 | 0 |
5 | 0 | 0.8 | 0 |
6 | 0 | 0.8 | 0 |
7 | 0 | 0 | 0.8 |
8 | 0 | 0 | 0.8 |
Let’s go down the checklist of criteria to see why it satisfies simple structure:
An easier pair of criteria from Pedhazur and Schmelkin (1991) states that each item should have a high loading on one and only one factor, and each factor should have high loadings on only a minority of the items.
For the following factor matrix, explain why it does not conform to simple structure using both the conventional and Pedhazur test.
| Item | Factor 1 | Factor 2 | Factor 3 |
|---|---|---|---|
| 1 | 0.8 | 0 | 0.8 |
| 2 | 0.8 | 0 | 0.8 |
| 3 | 0.8 | 0 | 0 |
| 4 | 0.8 | 0 | 0 |
| 5 | 0 | 0.8 | 0.8 |
| 6 | 0 | 0.8 | 0.8 |
| 7 | 0 | 0.8 | 0.8 |
| 8 | 0 | 0.8 | 0 |
Solution: Using the conventional test, although Criteria 1 and 2 are satisfied (each row has at least one zero, and each column has at least three zeros), Criterion 3 fails because for Factors 2 and 3 only 3/8 rows have a zero on one factor and a non-zero loading on the other. Additionally, for Factors 2 and 3, Items 5 through 7 have non-zero loadings on both, i.e., 3/8 rows have non-zero coefficients (failing Criteria 4 and 5 simultaneously). Using the Pedhazur method, Items 1, 2, 5, 6, and 7 have high loadings on two factors (failing the first criterion), and Factor 3 has high loadings on a majority, 5/8, of the items (failing the second criterion).
We know that the goal of factor rotation is to rotate the factor matrix so that it can approach simple structure in order to improve interpretability. Orthogonal rotation assumes that the factors are not correlated. The benefit of doing an orthogonal rotation is that loadings are simple correlations of items with factors, and standardized solutions can estimate unique contribution of each factor. The most common type of orthogonal rotation is Varimax rotation. We will walk through how to do this in SPSS.
The steps for running a two-factor Principal Axis Factoring are the same as before (Analyze – Dimension Reduction – Factor – Extraction), except that under Rotation – Method we check Varimax. Make sure under Display to check Rotated Solution and Loading plot(s), and under Maximum Iterations for Convergence enter 100.
Pasting the syntax into the SPSS editor you obtain:
Let’s first talk about what tables are the same or different from running a PAF with no rotation. First, we know that the unrotated factor matrix (Factor Matrix table) should be the same. Additionally, since the common variance explained by both factors should be the same, the Communalities table should be the same. The main difference is that we ran a rotation, so we should get the rotated solution (Rotated Factor Matrix) as well as the transformation used to obtain the rotation (Factor Transformation Matrix). Finally, although the total variance explained by all factors stays the same, the total variance explained by each factor will be different.
**Rotated Factor Matrix**

| Item | Factor 1 | Factor 2 |
|---|---|---|
| 1 | 0.646 | 0.139 |
| 2 | -0.188 | -0.129 |
| 3 | -0.490 | -0.281 |
| 4 | 0.624 | 0.268 |
| 5 | 0.544 | 0.221 |
| 6 | 0.229 | 0.507 |
| 7 | 0.275 | 0.881 |
| 8 | 0.442 | 0.202 |

Extraction Method: Principal Axis Factoring. Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 3 iterations.
The Rotated Factor Matrix table tells us what the factor loadings look like after rotation (in this case Varimax). Kaiser normalization is a method to obtain stability of solutions across samples. After rotation, the loadings are rescaled back to the proper size. This means that equal weight is given to all items when performing the rotation. The only drawback is if the communality is low for a particular item, Kaiser normalization will weight these items equally with items with high communality. As such, Kaiser normalization is preferred when communalities are high across all items. You can turn off Kaiser normalization by specifying
Here is what the Varimax rotated loadings look like without Kaiser normalization. Compared to the rotated factor matrix with Kaiser normalization, the pattern looks similar once you flip Factors 1 and 2; the flip itself may be an artifact of the rescaling. The differences that remain may be due to the low communalities for Item 2 (0.052) and Item 8 (0.236), since Kaiser normalization weights these items equally with the higher-communality items.
Rotated Factor Matrix

| Item | Factor 1 | Factor 2 |
|------|----------|----------|
| 1 | 0.207 | 0.628 |
| 2 | -0.148 | -0.173 |
| 3 | -0.331 | -0.458 |
| 4 | 0.332 | 0.592 |
| 5 | 0.277 | 0.517 |
| 6 | 0.528 | 0.174 |
| 7 | 0.905 | 0.180 |
| 8 | 0.248 | 0.418 |

Extraction Method: Principal Axis Factoring. Rotation Method: Varimax without Kaiser Normalization. Rotation converged in 3 iterations.
In the table above, absolute loadings higher than 0.4 are highlighted in blue for Factor 1 and in red for Factor 2. We can see that Items 6 and 7 load highly onto Factor 1 and Items 1, 3, 4, 5, and 8 load highly onto Factor 2. Item 2 does not seem to load highly on any factor. Looking more closely at Item 6 “My friends are better at statistics than me” and Item 7 “Computers are useful only for playing games”, we don’t see a clear construct that defines the two. Item 2, “I don’t understand statistics”, may be too general an item that isn’t captured by SPSS Anxiety. It’s debatable at this point whether to retain a two-factor or one-factor solution; at the very minimum, we should see whether Item 2 is a candidate for deletion.
The Factor Transformation Matrix tells us how the Factor Matrix was rotated. In SPSS, you will see a matrix with two rows and two columns because we have two factors.
Factor Transformation Matrix

| Factor | 1 | 2 |
|--------|---|---|
| 1 | 0.773 | 0.635 |
| 2 | -0.635 | 0.773 |

Extraction Method: Principal Axis Factoring. Rotation Method: Varimax with Kaiser Normalization.
How do we interpret this matrix? Well, we can see it as the way to move from the Factor Matrix to the Rotated Factor Matrix. From the Factor Matrix we know that the loading of Item 1 on Factor 1 is \(0.588\) and the loading of Item 1 on Factor 2 is \(-0.303\), which gives us the pair \((0.588,-0.303)\); but in the Rotated Factor Matrix the new pair is \((0.646,0.139)\). How do we obtain this new transformed pair of values? We can do what’s called matrix multiplication: take each column of the Factor Transformation Matrix, view it as another ordered pair, and sum the products of the matching elements. To get the first element, we multiply the ordered pair in the Factor Matrix \((0.588,-0.303)\) with the matching ordered pair \((0.773,-0.635)\) in the first column of the Factor Transformation Matrix.
$$(0.588)(0.773)+(-0.303)(-0.635)=0.455+0.192=0.647.$$
To get the second element, we can multiply the ordered pair in the Factor Matrix \((0.588,-0.303)\) with the matching ordered pair \((0.635,0.773)\) from the second column of the Factor Transformation Matrix:
$$(0.588)(0.635)+(-0.303)(0.773)=0.373-0.234=0.139.$$
Voila! We have obtained the new transformed pair with some rounding error. The figure below summarizes the steps we used to perform the transformation.
The Factor Transformation Matrix can also tell us the angle of rotation if we take the inverse cosine of the diagonal element. In this case, the angle of rotation is \(\cos^{-1}(0.773) = 39.4^{\circ}\). In the factor loading plot, you can see what that angle of rotation looks like, starting from \(0^{\circ}\) and rotating counterclockwise by \(39.4^{\circ}\). Notice that the newly rotated x- and y-axes are still at \(90^{\circ}\) angles from one another, hence the name orthogonal (in a non-orthogonal or oblique rotation the new axes are no longer \(90^{\circ}\) apart). The points do not move in relation to the axes but rotate with them.
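The transformation above can be checked numerically. Here is a minimal sketch (Python with NumPy, which is of course outside the SPSS workflow) that reproduces the rotated pair for Item 1 and the angle of rotation:

```python
import numpy as np

# Unrotated loadings of Item 1 from the Factor Matrix (values from the text).
unrotated = np.array([0.588, -0.303])

# Factor Transformation Matrix reported by SPSS.
T = np.array([[0.773, 0.635],
              [-0.635, 0.773]])

# Post-multiplying the unrotated loadings by T gives the rotated loadings.
rotated = unrotated @ T   # approximately (0.647, 0.139)

# The angle of rotation is the inverse cosine of the diagonal element.
angle = np.degrees(np.arccos(T[0, 0]))  # approximately 39.4 degrees

print(rotated.round(3), round(float(angle), 1))
```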
The Total Variance Explained table contains the same columns as the PAF solution with no rotation, but adds another set of columns called “Rotation Sums of Squared Loadings”. This makes sense because if our rotated Factor Matrix is different, the square of the loadings should be different, and hence the Sum of Squared loadings will be different for each factor. However, if you sum the Sums of Squared Loadings across all factors for the Rotation solution,
$$ 1.701 + 1.309 = 3.01$$
and for the unrotated solution,
$$ 2.511 + 0.499 = 3.01,$$
you will see that the two sums are the same. This is because rotation does not change the total common variance. Looking at the Rotation Sums of Squared Loadings for Factor 1, it still has the largest total variance, but now that shared variance is split more evenly.
Total Variance Explained

| Factor | Rotation SS Loadings: Total | % of Variance | Cumulative % |
|--------|-----------------------------|---------------|--------------|
| 1 | 1.701 | 21.258 | 21.258 |
| 2 | 1.309 | 16.363 | 37.621 |

Extraction Method: Principal Axis Factoring.
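As a quick numerical check (a Python/NumPy sketch, not SPSS output), squaring the Varimax-rotated loadings from the Rotated Factor Matrix above and summing down each column reproduces the Rotation Sums of Squared Loadings, and their total matches the unrotated total common variance:

```python
import numpy as np

# Varimax-rotated loadings (Rotated Factor Matrix with Kaiser normalization).
rotated = np.array([
    [0.646, 0.139],
    [-0.188, -0.129],
    [-0.490, -0.281],
    [0.624, 0.268],
    [0.544, 0.221],
    [0.229, 0.507],
    [0.275, 0.881],
    [0.442, 0.202],
])

# Sum of squared loadings down each column: one value per factor.
ss = (rotated ** 2).sum(axis=0)   # approximately (1.701, 1.309)

# Rotation leaves the total common variance unchanged.
total = ss.sum()                  # approximately 3.01
print(ss.round(3), round(float(total), 2))
```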
Varimax is the most popular orthogonal rotation, but it is only one among several. The benefit of Varimax rotation is that it maximizes the variance of the loadings within each factor, maximizing the differences between high and low loadings on a particular factor: higher loadings are made higher and lower loadings are made lower. This makes Varimax good for achieving simple structure but not as good for detecting an overall factor, because it splits the variance of major factors among lesser ones. Quartimax may be a better choice for detecting an overall factor: it maximizes the squared loadings so that each item loads most strongly onto a single factor.
Here is the output of the Total Variance Explained table juxtaposed side-by-side for Varimax versus Quartimax rotation.
Total Variance Explained

| Factor | Quartimax Total | Varimax Total |
|--------|-----------------|---------------|
| 1 | 2.381 | 1.701 |
| 2 | 0.629 | 1.309 |

Extraction Method: Principal Axis Factoring.
You will see that whereas Varimax distributes the variances evenly across both factors, Quartimax tries to consolidate more variance into the first factor.
Equamax is a hybrid of Varimax and Quartimax but, because of this, may behave erratically and, according to Pett et al. (2003), is not generally recommended.
In oblique rotation, the factors are no longer orthogonal to each other (the x and y axes are no longer at \(90^{\circ}\) angles to each other). Like orthogonal rotation, the goal is to rotate the reference axes about the origin to achieve a simpler and more meaningful factor solution compared to the unrotated solution. In oblique rotation, you will see three unique tables in the SPSS output: the factor pattern matrix, the factor structure matrix, and the factor correlation matrix.
Suppose the Principal Investigator hypothesizes that the two factors are correlated, and wishes to test this assumption. Let’s proceed with one of the most common types of oblique rotations in SPSS, Direct Oblimin.
The steps to running a Direct Oblimin are the same as before (Analyze – Dimension Reduction – Factor – Extraction), except that under Rotation – Method we check Direct Oblimin. The other parameter we have to specify is delta, which defaults to zero. Technically, when delta = 0 this is known as Direct Quartimin. Larger positive values of delta increase the correlation among factors; in general, however, you don’t want the correlations to be too high, or else there is no reason to split your factors up. In fact, SPSS caps the delta value at 0.8 (the cap for negative values is -9999). Negative delta values can push the factors toward orthogonal solutions. For the purposes of this analysis we will leave delta = 0 and do a Direct Quartimin analysis.
All the questions below pertain to Direct Oblimin in SPSS.
Answers: 1. T, 2. F, larger delta values, 3. F, delta leads to higher factor correlations, in general you don’t want factors to be too highly correlated
The factor pattern matrix represents partial standardized regression coefficients of each item on a particular factor. For example, \(0.740\) is the effect of Factor 1 on Item 1 controlling for Factor 2, and \(-0.137\) is the effect of Factor 2 on Item 1 controlling for Factor 1. Just as in orthogonal rotation, the square of a loading represents the contribution of the factor to the variance of the item, but here excluding the overlap between correlated factors. Factor 1 uniquely contributes \((0.740)^2=0.548=54.8\%\) of the variance in Item 1 (controlling for Factor 2), and Factor 2 uniquely contributes \((-0.137)^2=0.019=1.9\%\) of the variance in Item 1 (controlling for Factor 1).
Pattern Matrix

| Item | Factor 1 | Factor 2 |
|------|----------|----------|
| 1 | 0.740 | -0.137 |
| 2 | -0.180 | -0.067 |
| 3 | -0.490 | -0.108 |
| 4 | 0.660 | 0.029 |
| 5 | 0.580 | 0.011 |
| 6 | 0.077 | 0.504 |
| 7 | -0.017 | 0.933 |
| 8 | 0.462 | 0.036 |

Extraction Method: Principal Axis Factoring. Rotation Method: Oblimin with Kaiser Normalization. Rotation converged in 5 iterations.
The factor structure matrix represents the simple zero-order correlations of the items with each factor (it’s as if you ran a simple regression of a single factor on the outcome). For example, \(0.653\) is the simple correlation of Factor 1 with Item 1 and \(0.333\) is the simple correlation of Factor 2 with Item 1. The more correlated the factors, the greater the difference between the pattern and structure matrices and the more difficult it is to interpret the factor loadings. From this we can see that Items 1, 3, 4, 5, and 8 load highly onto Factor 1 and Items 6 and 7 load highly onto Factor 2. Item 2 doesn’t seem to load well on either factor.
Additionally, we can look at the variance explained by each factor without controlling for the other factor. For example, Factor 1 contributes \((0.653)^2=0.426=42.6\%\) of the variance in Item 1, and Factor 2 contributes \((0.333)^2=0.111=11.1\%\) of the variance in Item 1. Notice that the contribution of Factor 2 is higher (\(11.1\%\) vs. \(1.9\%\)) because in the Pattern Matrix we controlled for the effect of Factor 1, whereas in the Structure Matrix we did not.
Structure Matrix

| Item | Factor 1 | Factor 2 |
|------|----------|----------|
| 1 | 0.653 | 0.333 |
| 2 | -0.222 | -0.181 |
| 3 | -0.559 | -0.420 |
| 4 | 0.678 | 0.449 |
| 5 | 0.587 | 0.380 |
| 6 | 0.398 | 0.553 |
| 7 | 0.577 | 0.923 |
| 8 | 0.485 | 0.330 |

Extraction Method: Principal Axis Factoring. Rotation Method: Oblimin with Kaiser Normalization.
Recall that the more correlated the factors, the greater the difference between the pattern and structure matrices and the more difficult it is to interpret the factor loadings. In our case, Factor 1 and Factor 2 are fairly highly correlated, which is why there is such a big difference between the factor pattern and factor structure matrices.
Factor Correlation Matrix

| Factor | 1 | 2 |
|--------|---|---|
| 1 | 1.000 | 0.636 |
| 2 | 0.636 | 1.000 |

Extraction Method: Principal Axis Factoring. Rotation Method: Oblimin with Kaiser Normalization.
The difference between an orthogonal and an oblique rotation is that the factors in an oblique rotation are correlated. This means that not only must we account for the angle of axis rotation \(\theta\), we must also account for the angle of correlation \(\phi\). The angle of axis rotation is defined as the angle between the rotated and unrotated axes (blue and black axes). From the Factor Correlation Matrix, we know that the correlation is \(0.636\), so the angle of correlation is \(\cos^{-1}(0.636) = 50.5^{\circ}\), which is the angle between the two rotated axes (blue x- and blue y-axis). The sum of the rotations \(\theta\) and \(\phi\) is the total angle of rotation. We are not given the angle of axis rotation, so we only know that the total angle of rotation is \(\theta + \phi = \theta + 50.5^{\circ}\).
The structure matrix is in fact a derivative of the pattern matrix. If you multiply the pattern matrix by the factor correlation matrix, you will get back the factor structure matrix. Let’s take the example of the ordered pair \((0.740,-0.137)\) from the Pattern Matrix, which represents the partial correlation of Item 1 with Factors 1 and 2 respectively. Performing matrix multiplication for the first column of the Factor Correlation Matrix we get
$$ (0.740)(1) + (-0.137)(0.636) = 0.740 - 0.087 = 0.653.$$
Similarly, we multiply the ordered factor pair by the second column of the Factor Correlation Matrix to get:
$$ (0.740)(0.636) + (-0.137)(1) = 0.471 - 0.137 = 0.334 \approx 0.333. $$
Looking at the first row of the Structure Matrix we get \((0.653,0.333)\), which matches our calculation up to rounding! This neat fact can be depicted with the following figure:
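The same multiplication can be done for all eight items at once. The sketch below (Python with NumPy, not part of the SPSS output) multiplies the Pattern Matrix by the Factor Correlation Matrix and recovers the Structure Matrix up to rounding, and also computes the angle of correlation:

```python
import numpy as np

# Pattern Matrix loadings from the Direct Quartimin solution.
pattern = np.array([
    [0.740, -0.137],
    [-0.180, -0.067],
    [-0.490, -0.108],
    [0.660, 0.029],
    [0.580, 0.011],
    [0.077, 0.504],
    [-0.017, 0.933],
    [0.462, 0.036],
])

# Factor Correlation Matrix (phi).
phi = np.array([[1.000, 0.636],
                [0.636, 1.000]])

# Pattern times phi recovers the Structure Matrix; the first row
# comes out approximately (0.653, 0.333).
structure = pattern @ phi

# The angle between the two oblique axes is the inverse cosine of the correlation.
angle = np.degrees(np.arccos(phi[0, 1]))  # approximately 50.5 degrees

print(structure.round(3), round(float(angle), 1))
```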
As a quick aside, suppose the factors are orthogonal, which means the factor correlation matrix has 1’s on the diagonal and zeros on the off-diagonal. A quick calculation with the ordered pair \((0.740,-0.137)\) gives
$$ (0.740)(1) + (-0.137)(0) = 0.740$$
and similarly,
$$ (0.740)(0) + (-0.137)(1) = -0.137$$
and you get back the same ordered pair. This is called multiplying by the identity matrix (think of it as multiplying \(2 \times 1 = 2\)).
Answers: 1. Decrease the delta values so that the correlation between factors approaches zero. 2. T, the correlations will become more orthogonal and hence the pattern and structure matrix will be closer.
The column Extraction Sums of Squared Loadings is the same as in the unrotated solution, but we have an additional column known as Rotation Sums of Squared Loadings. SPSS itself notes that “when factors are correlated, sums of squared loadings cannot be added to obtain total variance”. You will note that compared to the Extraction Sums of Squared Loadings, the Rotation Sums of Squared Loadings is only slightly lower for Factor 1 but much higher for Factor 2. This is because, unlike in orthogonal rotation, these sums are no longer the unique contributions of Factor 1 and Factor 2. How does SPSS obtain the Rotation Sums of Squared Loadings? It squares the Structure Matrix loadings and sums down the items.
Total Variance Explained

| Factor | Extraction SS Loadings: Total | % of Variance | Cumulative % | Rotation SS Loadings: Total |
|--------|-------------------------------|---------------|--------------|-----------------------------|
| 1 | 2.511 | 31.382 | 31.382 | 2.318 |
| 2 | 0.499 | 6.238 | 37.621 | 1.931 |

Extraction Method: Principal Axis Factoring. Note: when factors are correlated, sums of squared loadings cannot be added to obtain a total variance.
As a demonstration, let’s square and sum the Structure Matrix loadings for Factor 1:
$$ (0.653)^2 + (-0.222)^2 + (-0.559)^2 + (0.678)^2 + (0.587)^2 + (0.398)^2 + (0.577)^2 + (0.485)^2 = 2.318.$$
Note that \(2.318\) matches the Rotation Sums of Squared Loadings for the first factor. This means that the Rotation Sums of Squared Loadings represent the non-unique contribution of each factor to total common variance, and summing these squared loadings across all factors can lead to estimates that are greater than the total common variance.
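The column sums above can be computed directly from the Structure Matrix for both factors at once (again a Python/NumPy sketch):

```python
import numpy as np

# Structure Matrix loadings from the Direct Quartimin solution.
structure = np.array([
    [0.653, 0.333],
    [-0.222, -0.181],
    [-0.559, -0.420],
    [0.678, 0.449],
    [0.587, 0.380],
    [0.398, 0.553],
    [0.577, 0.923],
    [0.485, 0.330],
])

# Squaring and summing down the items reproduces SPSS's Rotation Sums of
# Squared Loadings for an oblique solution (~2.318 and ~1.931, up to rounding).
ss = (structure ** 2).sum(axis=0)
print(ss.round(3))
```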
Finally, let’s conclude by interpreting the factor loadings more carefully. Let’s compare the Pattern Matrix and Structure Matrix tables side-by-side, highlighting absolute loadings higher than 0.4 in blue for Factor 1 and in red for Factor 2. The absolute loadings in the Pattern Matrix are in general higher for Factor 1 and lower for Factor 2 compared to the Structure Matrix, which makes sense because the Pattern Matrix partials out the effect of the other factor.

Looking at the Pattern Matrix, Items 1, 3, 4, 5, and 8 load highly on Factor 1, and Items 6 and 7 load highly on Factor 2. Looking at the Structure Matrix, Items 1, 3, 4, 5, 7, and 8 load highly onto Factor 1 and Items 3, 4, 6, and 7 load highly onto Factor 2. Item 2 doesn’t seem to load on either factor. The two matrices are somewhat inconsistent, which can be explained by the fact that Items 3, 4, and 7 load onto both factors fairly evenly in the Structure Matrix but not in the Pattern Matrix. For this particular analysis, it seems to make more sense to interpret the Pattern Matrix, because there it is clear that Factor 1 contributes uniquely to most items in the SAQ-8 and Factor 2 contributes common variance only to two items (Items 6 and 7). There is an argument here that Item 2 can be eliminated from our survey and the factors consolidated into one SPSS Anxiety factor. We talk to the Principal Investigator and agree that it is feasible to accept SPSS Anxiety as the single factor explaining the common variance in all the items, but we choose to remove Item 2, so that the SAQ-8 becomes the SAQ-7.
Pattern Matrix and Structure Matrix

| Item | Pattern Factor 1 | Pattern Factor 2 | Structure Factor 1 | Structure Factor 2 |
|------|------------------|------------------|--------------------|--------------------|
| 1 | 0.740 | -0.137 | 0.653 | 0.333 |
| 2 | -0.180 | -0.067 | -0.222 | -0.181 |
| 3 | -0.490 | -0.108 | -0.559 | -0.420 |
| 4 | 0.660 | 0.029 | 0.678 | 0.449 |
| 5 | 0.580 | 0.011 | 0.587 | 0.380 |
| 6 | 0.077 | 0.504 | 0.398 | 0.553 |
| 7 | -0.017 | 0.933 | 0.577 | 0.923 |
| 8 | 0.462 | 0.036 | 0.485 | 0.330 |
Answers: 1. T, 2. F, represent the non -unique contribution (which means the total sum of squares can be greater than the total communality), 3. F, the Structure Matrix is obtained by multiplying the Pattern Matrix with the Factor Correlation Matrix, 4. T, it’s like multiplying a number by 1, you get the same number back, 5. F, this is true only for orthogonal rotations, the SPSS Communalities table in rotated factor solutions is based off of the unrotated solution, not the rotated solution.
As a special note, did we really achieve simple structure? Although rotation helps us achieve simple structure, if the interrelationships themselves do not conform to simple structure, all we can do is modify our model. In this case we chose to remove Item 2 from our model.
Promax rotation begins with a Varimax (orthogonal) rotation and then uses the parameter kappa to raise the loadings to a power, which shrinks the small loadings in particular. Promax also runs faster than Direct Oblimin; in our example Promax took 3 iterations while Direct Quartimin (Direct Oblimin with delta = 0) took 5 iterations.
Suppose the Principal Investigator is happy with the final factor analysis which was the two-factor Direct Quartimin solution. She has a hypothesis that SPSS Anxiety and Attribution Bias predict student scores on an introductory statistics course, so would like to use the factor scores as a predictor in this new regression analysis. Since a factor is by nature unobserved, we need to first predict or generate plausible factor scores. In SPSS, there are three methods to factor score generation, Regression, Bartlett, and Anderson-Rubin.
In order to generate factor scores, run the same factor analysis model but click on Factor Scores (Analyze – Dimension Reduction – Factor – Factor Scores). Then check Save as variables, pick the Method and optionally check Display factor score coefficient matrix.
The code pasted in the SPSS Syntax Editor looks like this:
Here we picked the Regression approach after fitting our two-factor Direct Quartimin solution. After generating the factor scores, SPSS will add two extra variables to the end of your variable list, which you can view via Data View. The figure below shows what this looks like for the first 5 participants, which SPSS calls FAC1_1 and FAC2_1 for the first and second factors. These are now ready to be entered in another analysis as predictors.
For those who want to understand how the scores are generated, we can refer to the Factor Score Coefficient Matrix. These are essentially the regression weights that SPSS uses to generate the scores. We know that the ordered pair of factor scores for the first participant is \((-0.880, -0.113)\), and that the 8 item scores for the first participant are \(2, 1, 4, 2, 2, 2, 3, 1\). However, what SPSS actually uses is the standardized scores, which can easily be obtained in SPSS via Analyze – Descriptive Statistics – Descriptives – Save standardized values as variables. The standardized scores are: \(-0.452, -0.733, 1.32, -0.829, -0.749, -0.2025, 0.069, -1.42\). Using the Factor Score Coefficient Matrix, we multiply the participant’s standardized scores by each column of the coefficient matrix. For the first factor:
$$ \begin{eqnarray} &(0.284)(-0.452) + (-0.048)(-0.733) + (-0.171)(1.32) + (0.274)(-0.829) \\ &+ (0.197)(-0.749) + (0.048)(-0.2025) + (0.174)(0.069) + (0.133)(-1.42) \\ &= -0.880, \end{eqnarray} $$
which matches FAC1_1 for the first participant. You can continue this same procedure for the second factor to obtain FAC2_1.
Factor Score Coefficient Matrix

| Item | Factor 1 | Factor 2 |
|------|----------|----------|
| 1 | 0.284 | 0.005 |
| 2 | -0.048 | -0.019 |
| 3 | -0.171 | -0.045 |
| 4 | 0.274 | 0.045 |
| 5 | 0.197 | 0.036 |
| 6 | 0.048 | 0.095 |
| 7 | 0.174 | 0.814 |
| 8 | 0.133 | 0.028 |

Extraction Method: Principal Axis Factoring. Rotation Method: Oblimin with Kaiser Normalization. Factor Scores Method: Regression.
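The hand calculation above can be reproduced for both factors at once by multiplying the participant’s standardized item scores by the coefficient matrix (a Python/NumPy sketch; the second score comes out around \(-0.115\) rather than the reported \(-0.113\) because the tabled coefficients are rounded):

```python
import numpy as np

# Factor Score Coefficient Matrix (regression method), from the table above.
coef = np.array([
    [0.284, 0.005],
    [-0.048, -0.019],
    [-0.171, -0.045],
    [0.274, 0.045],
    [0.197, 0.036],
    [0.048, 0.095],
    [0.174, 0.814],
    [0.133, 0.028],
])

# Standardized item scores for the first participant.
z = np.array([-0.452, -0.733, 1.32, -0.829, -0.749, -0.2025, 0.069, -1.42])

# Factor scores are the standardized items weighted by the coefficients.
scores = z @ coef   # approximately (-0.880, -0.115)
print(scores.round(3))
```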
The second table is the Factor Score Covariance Matrix:

Factor Score Covariance Matrix

| Factor | 1 | 2 |
|--------|---|---|
| 1 | 1.897 | 1.895 |
| 2 | 1.895 | 1.990 |

Extraction Method: Principal Axis Factoring. Rotation Method: Oblimin with Kaiser Normalization. Factor Scores Method: Regression.
This table can be interpreted as the covariance matrix of the factor scores; however, it equals the observed covariance matrix of the saved factor scores only when the factors are orthogonal. For example, if we obtain the raw covariance matrix of the saved factor scores we get
Covariances of the saved factor scores

|  | FAC1_1 | FAC2_1 |
|---|--------|--------|
| FAC1_1 | 0.777 | 0.604 |
| FAC2_1 | 0.604 | 0.870 |
You will notice that these values are much lower. Let’s compare the same two tables but for Varimax rotation:
Factor Score Covariance Matrix

| Factor | 1 | 2 |
|--------|---|---|
| 1 | 0.670 | 0.131 |
| 2 | 0.131 | 0.805 |

Extraction Method: Principal Axis Factoring. Rotation Method: Varimax with Kaiser Normalization. Factor Scores Method: Regression.
If you compare these elements to the Covariance table below, you will notice they are the same.
Covariances of the saved factor scores

|  | FAC1_1 | FAC2_1 |
|---|--------|--------|
| FAC1_1 | 0.670 | 0.131 |
| FAC2_1 | 0.131 | 0.805 |
Note that with the Bartlett and Anderson-Rubin methods you will not obtain the Factor Score Covariance Matrix.
Among the three methods, each has its pluses and minuses. The Regression method maximizes the correlation (and hence validity) between the factor scores and the underlying factor, but the scores can be somewhat biased, and even with an orthogonal solution the factor scores can still be correlated with each other. With Bartlett’s method, the factor scores correlate highly with their own factor and not with others, and they are an unbiased estimate of the true factor score. (Unbiased means that with repeated sampling, the average of the estimated factor scores equals the average of the true factor scores.) The Anderson-Rubin method scales the factor scores so that they are uncorrelated with other factors and with other factor scores. Since Anderson-Rubin scores impose a correlation of zero between factor scores, they are not the best option for oblique rotations. Additionally, Anderson-Rubin scores are biased.
In summary, if you do an orthogonal rotation, you can pick any of the three methods: use Bartlett if you want unbiased scores, use the Regression method if you want to maximize validity, and use Anderson-Rubin if you want the factor scores themselves to be uncorrelated with other factor scores. If you do oblique rotations, it’s preferable to stick with the Regression method; do not use Anderson-Rubin for oblique rotations.
Data is everywhere. From research to artificial intelligence, data has become an essential commodity, a link between our past and our future. An organization that wants to learn from its past records needs data; a programmer who wants to formulate a machine learning algorithm needs data to begin with.
While the world has moved on to ever more advanced technology, it is easy to forget that data is the building block of all the technological advancements that have together made the world so advanced.
When it comes to data, a number of tools and techniques are put to work to arrange, organize, and accumulate it the way one wants; Factor Analysis is one of them. A data reduction technique, Factor Analysis is a statistical method used to reduce a large number of observed variables to fewer underlying factors, for better insight into a given dataset.
But first, we should understand what a factor is. A factor is a set of observed variables that have similar response patterns. Since the variables in a given dataset can be too numerous to deal with, Factor Analysis condenses them into fewer variables that are actionable and substantial to work upon.
A technique of dimensionality reduction in data mining, Factor Analysis works on narrowing the availability of variables in a given data set, allowing deeper insights and better visibility of patterns for data research.
Most commonly used to identify the relationship between various variables in statistics , Factor Analysis can be thought of as a compressor that compresses the size of variables and produces a much enhanced, insightful, and accurate variable set.
“FA is considered an extension of principal component analysis since the ultimate objective for both techniques is a data reduction.” Factor Analysis in Data Reduction
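To make the idea concrete, here is a minimal, hypothetical sketch in Python with NumPy (all names, sizes, and loadings are invented for illustration): four correlated variables are summarized by the leading eigenvector of their correlation matrix, the same mechanism that PCA-based extraction uses for data reduction.

```python
import numpy as np

# Simulate four variables driven by one latent factor plus noise
# (the loadings 0.8, 0.7, 0.6, 0.5 are purely illustrative).
rng = np.random.default_rng(1)
latent = rng.normal(size=500)
X = np.outer(latent, [0.8, 0.7, 0.6, 0.5]) + 0.4 * rng.normal(size=(500, 4))

# Correlation matrix of the observed variables.
R = np.corrcoef(X, rowvar=False)

# np.linalg.eigh returns eigenvalues in ascending order;
# the last eigenvector is the leading component.
eigvals, eigvecs = np.linalg.eigh(R)
loadings = eigvecs[:, -1] * np.sqrt(eigvals[-1])

# One component now summarizes the four correlated variables.
print(loadings.round(2))
```

In a full analysis one would retain as many components or factors as the data support and then rotate them for interpretability.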
Developed in 1904 by Spearman, Factor Analysis is broadly divided into various types based upon the approach to detect underlying variables and establish a relationship between them.
While there are a variety of techniques to conduct factor analysis like Principal Component Analysis or Independent Component Analysis , Factor Analysis can be divided into 2 types which we will discuss below. Let us get started.
As its name suggests, Confirmatory Factor Analysis (CFA) lets one determine whether a relationship between a set of observed variables and their underlying constructs exists.
It helps one confirm whether there is a connection between two components of variables in a given dataset. Usually, the purpose of CFA is to test whether certain data fit the requirements of a particular hypothesis.
The process begins with a researcher formulating a hypothesis made to fit along the lines of a certain theory. If the constraints imposed on the model do not fit the data well, the model is rejected, and it is concluded that no relationship exists between the factor and its underlying construct. Hypothesis testing thus also finds a place in the world of Factor Analysis.
In the case of Exploratory Factor Statistical Analysis , the purpose is to determine/explore the underlying latent structure of a large set of variables. EFA, unlike CFA, tends to uncover the relationship, if any, between measured variables of an entity (for example - height, weight, etc. in a human figure).
While CFA works on finding a relationship between a set of observed variables and their underlying structure, this works to uncover a relationship between various variables within a given dataset.
Conducting Exploratory Factor Analysis involves figuring out the total number of factors underlying a dataset.
“EFA is generally considered to be more of a theory-generating procedure than a theory-testing procedure. In contrast, confirmatory factor analysis (CFA) is generally based on a strong theoretical and/or empirical foundation that allows the researcher to specify an exact factor model in advance.” EFA in Hypothesis Testing
With immense use in various fields in real life, this segment presents a list of applications of Factor Analysis and the way FA is used in day-to-day operations.
Applications of factor analysis
Marketing is defined as the act of promoting a good or a service or even a brand. When it comes to Factor Analysis in marketing, one can benefit immensely from this statistical method.
In order to boost marketing campaigns and accelerate success, in the long run, companies employ Factor Analysis techniques that help to find a correlation between various variables or factors of a marketing campaign.
Moreover, FA also helps to establish connections with customer satisfaction and consequent feedback after a marketing campaign in order to check its efficacy and impact on the audiences.
That said, the realm of marketing can largely benefit from Factor Analysis and trigger sales with respect to much-enhanced feedback and customer satisfaction reports.
In data mining, Factor Analysis can play a role as important as that of artificial intelligence. Owing to its ability to transform a complex and vast dataset into a group of filtered out variables that are related to each other in some way or the other, FA eases out the process of data mining.
For data scientists, the tedious task of finding relationships and establishing correlation among various variables has always been full of obstacles and errors.
However, with the help of this statistical method, data mining has become much more advanced.
Machine Learning and data mining tools go hand in hand. Perhaps this is the reason why Factor Analysis finds a place among Machine Learning tools and techniques.
As Factor Analysis in machine learning helps in reducing the number of variables in a given dataset to procure a more accurate and enhanced set of observed factors, various machine learning algorithms are put to use to work accordingly.
These algorithms are trained on large amounts of data so that they work correctly and can support other applications. As an unsupervised machine learning method, FA is largely used for dimensionality reduction in machine learning.
Thereby, machine learning can very well collaborate with Factor Analysis to give rise to data mining techniques and make the task of data research massively efficient.
Nutritional Science is a prominent field of work in the contemporary scenario. By focusing on the dietary practices of a given population, Factor Analysis helps to establish a relationship between the consumption of nutrients in an adult’s diet and the nutritional health of that person.
Furthermore, an individual’s nutrient intake and consequent health status have helped nutritionists to compute the appropriate quantity of nutrients one should intake in a given period of time.
The application of Factor Analysis in business is rather surprising and satisfactory.
Remember the times when business firms had to employ professionals to dig out patterns from past records in order to lay a road ahead for strategic business plans?
Well, gone are the days when so much work had to be done. Thanks to Factor Analysis, the world of business can use it for eliminating the guesswork and formulating more accurate and straightforward decisions in various aspects like budgeting, marketing, production, and transport.
Having learned about Factor Analysis in detail, let us now move on to looking closely into the pros and cons of this statistical method.
Measurable attributes.
The first and foremost pro of FA is that it is open to all measurable attributes. Be it subjective or objective, any kind of attribute can be worked upon when it comes to this statistical technique.
Unlike some statistical models that only work on objective attributes, Factor Analysis goes well with both subjective and objective attributes.
While data research and data mining algorithms can cost a lot due to the extraordinary charges, this statistical model is surprisingly cost-effective and does not take many resources to work with.
That said, it can be incorporated by any beginner or an experienced professional in light of its cost-effective and easy approach towards data mining and data reduction.
While many machine learning algorithms are rigid and constricted to a single approach, Factor Analysis does not work that way.
Rather, this statistical model has a flexible approach towards multivariate datasets that let one obtain relationships or correlations between various variables and their underlying components.
Incomplete results.
Alongside these advantages, the method has several drawbacks. Chiefly, factor analysis can produce poor results when a dataset is incomplete: variables that share traits with other data points can still go unnoticed if they sit isolated in a vast dataset, leaving the results incomplete.
Another drawback is that factor analysis does not identify complicated factors underlying a dataset. While some results clearly indicate a correlation between two variables, more complicated correlations can go unnoticed, and this non-identification of complex factors and their relationships can be a problem for data research.
Reliance on theory.
Even though factor analysis skills can be imitated by machine learning algorithms, the method still relies on theory, and therefore on data researchers. A computer can handle many components of a dataset, but other details must be examined by humans. One of the major drawbacks of factor analysis is thus that it cannot fully function without manual assistance.
To sum up, factor analysis is an extensive statistical model used to reduce the dimensions of a dataset by condensing the observed variables into a smaller set of factors.
By arranging observed variables into groups of super-variables, factor analysis has strongly shaped the way data mining is done, and numerous fields now rely on the technique for better performance.
Introduction to Factor Analysis
Factor analysis is a sophisticated statistical method aimed at reducing a large number of variables into a smaller set of factors. This technique is valuable for extracting the maximum common variance from all variables, transforming them into a single score for further analysis. As a part of the general linear model (GLM), factor analysis is predicated on certain key assumptions such as linearity, absence of multicollinearity, inclusion of relevant variables, and a true correlation between variables and factors.
Principal Methods of Factor Extraction
Principal Component Analysis (PCA):
PCA is the most widely used technique. It begins by extracting the maximum variance, assigning it to the first factor. Subsequent factors are determined by removing variance accounted for by earlier factors and extracting the maximum variance from what remains. This sequential process continues until all factors are identified.
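The sequential extraction described above can be sketched with plain NumPy: the eigendecomposition of a correlation matrix yields the component variances (eigenvalues) in exactly this largest-first order. The 3-item correlation matrix below is hypothetical, chosen purely for illustration.

```python
import numpy as np

# Hypothetical correlation matrix for three items (illustrative values).
R = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])

# Eigendecomposition of R: each eigenvalue is the variance captured by
# one component.  Sorting descending mirrors the extraction order.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Total variance equals the number of items (the trace of R), and the
# first component takes the largest share of it.
total = eigvals.sum()
explained = eigvals / total
```

Each successive eigenvalue is the maximum variance extractable from what the earlier components left behind, which is why they arrive in decreasing order.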
Common Factor Analysis:
Preferred for structural equation modeling (SEM), this method focuses on extracting common variance among variables, excluding unique variances. It’s particularly useful for understanding underlying relationships that may not be immediately apparent from the observed variables.
Image Factoring:
Based on a correlation matrix, image factoring uses ordinary least squares regression to predict factors, making it distinct in its approach to factor extraction.
Maximum Likelihood Method:
This technique utilizes the maximum likelihood estimation approach to factor analysis, working from the correlation matrix to derive factors.
Other Methods:
Including Alpha factoring and weighted least squares, these methods provide alternatives that may be suitable depending on the specific characteristics of the data set.
Factor Loadings and Their Interpretation
Factor loadings play a crucial role in factor analysis, representing the correlation between the variable and the factor. A factor loading of 0.7 or higher typically indicates that the factor sufficiently captures the variance of that variable. These loadings help in determining the importance and contribution of each variable to a factor.
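As a sketch of how loadings arise in a PCA-style extraction: scaling each eigenvector by the square root of its eigenvalue gives loadings that are item-component correlations, and their squared row sums recover each item's variance. The correlation matrix below is hypothetical, used only to make the arithmetic concrete.

```python
import numpy as np

# Hypothetical 3-item correlation matrix (illustrative values only).
R = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])

eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Loadings: eigenvector columns scaled by the square root of their
# eigenvalues.  Each entry is the correlation between item and component.
loadings = eigvecs * np.sqrt(eigvals)

# Summing squared loadings across all components recovers each item's
# total variance, which is 1.0 for a correlation matrix.
communalities = (loadings ** 2).sum(axis=1)

# Items whose first-component loading is at least 0.7 in absolute value
# are considered well captured by that component (the usual rule of thumb).
well_captured = np.abs(loadings[:, 0]) >= 0.7
```

The 0.7 cutoff applied in the last line is the conventional threshold mentioned above, not a hard statistical rule.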
Eigenvalues and Factor Scores
Determining the Number of Factors
The number of factors to retain can be determined by several criteria: the Kaiser criterion (keep factors with eigenvalues greater than 1), the scree plot (keep factors above the "elbow" where the eigenvalues level off), parallel analysis (keep factors whose eigenvalues exceed those obtained from random data), and the cumulative proportion of variance explained.
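Two commonly used retention criteria, keeping factors with eigenvalues above 1 (the Kaiser criterion) and keeping enough factors to reach a target share of variance, can be computed directly from the eigenvalues. The 3-item correlation matrix and the 70% cutoff below are illustrative choices, not fixed standards.

```python
import numpy as np

# Hypothetical 3-item correlation matrix used for illustration.
R = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])

# Eigenvalues sorted largest-first.
eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]

# Kaiser criterion: keep factors whose eigenvalue exceeds 1.
n_kaiser = int((eigvals > 1.0).sum())

# Variance-explained rule: keep enough factors to reach, say, 70%.
cumulative = np.cumsum(eigvals) / eigvals.sum()
n_variance = int(np.searchsorted(cumulative, 0.70) + 1)
```

On this matrix the two rules disagree (one factor versus two), which is exactly why analysts typically consult several criteria, plus the scree plot, before deciding.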
Rotation Techniques to Enhance Interpretability
Rotations in factor analysis, whether orthogonal like Varimax and Quartimax or oblique like Direct Oblimin and Promax, help in achieving a simpler, more interpretable factor structure. These methods adjust the axes on which factors are plotted to maximize the distinction between factors and improve the clarity of the results.
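To illustrate what an orthogonal rotation does mechanically, here is a minimal varimax sketch (the standard SVD-based iteration, written from scratch rather than taken from any particular library). Rotation re-expresses the loadings without changing any variable's communality, which the final check confirms. The loading matrix is hypothetical.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation via the standard SVD-based iteration."""
    L = loadings.copy()
    n, k = L.shape
    T = np.eye(k)                        # accumulated rotation matrix
    crit_old = 0.0
    for _ in range(max_iter):
        Lr = L @ T
        # SVD step that increases the varimax criterion at each pass.
        u, s, vt = np.linalg.svd(
            L.T @ (Lr ** 3 - Lr @ np.diag((Lr ** 2).sum(axis=0)) / n)
        )
        T = u @ vt
        crit = s.sum()
        if crit - crit_old < tol:
            break
        crit_old = crit
    return L @ T

# Hypothetical unrotated loadings for four items on two factors.
A = np.array([
    [0.8, 0.3],
    [0.7, 0.4],
    [0.3, 0.9],
    [0.2, 0.8],
])
A_rot = varimax(A)

# The rotation is orthogonal, so each item's communality is unchanged.
before = (A ** 2).sum(axis=1)
after = (A_rot ** 2).sum(axis=1)
```

Oblique rotations such as Promax work differently: they drop the orthogonality constraint and allow the factors themselves to correlate.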
Assumptions and Data Requirements
Factor analysis assumes approximately linear relationships among variables, the absence of extreme multicollinearity, the inclusion of the relevant variables, and a sample large enough to yield stable correlations. Within those requirements it is a powerful tool for data reduction and interpretation, enabling researchers to uncover underlying dimensions or factors that explain patterns in complex data sets. By adhering to its assumptions and appropriately choosing factor extraction and rotation methods, researchers can effectively use factor analysis to simplify data, construct scales, and enhance the validity of their studies.
Factor analysis is a powerful data reduction technique that enables researchers to investigate concepts that cannot easily be measured directly. By boiling down a large number of variables into a handful of comprehensible underlying factors, factor analysis results in easy-to-understand, actionable data.
By applying this method to your research, you can spot trends faster and see themes throughout your datasets, enabling you to learn what the data points have in common.
Unlike statistical methods such as regression analysis, factor analysis does not require you to designate dependent and independent variables in advance.
Factor analysis is most commonly used to identify the relationship between all of the variables included in a given dataset.
Think of factor analysis as shrink wrap. When applied to a large amount of data, it compresses the set into a smaller set that is far more manageable, and easier to understand.
Determining when to use particular statistical methods to get the most insight out of your data can be tricky.
When considering factor analysis, have your goal top-of-mind.
There are three main forms of factor analysis. If your goal aligns with any of these forms, then you should choose factor analysis as your statistical method of choice:
Exploratory Factor Analysis should be used when you need to develop a hypothesis about a relationship between variables.
Confirmatory Factor Analysis should be used to test a hypothesis about the relationship between variables.
Construct Validity should be used to test the degree to which your survey actually measures what it is intended to measure.
If you know that you’ll want to perform a factor analysis on response data from a survey, there are a few things you can do ahead of time to ensure that your analysis will be straightforward, informative, and actionable.
Large datasets are the lifeblood of factor analysis. You’ll need large groups of survey respondents, often found through panel services, for factor analysis to yield significant results.
While variables such as population size and your topic of interest will influence how many respondents you need, it’s best to maintain a “more respondents the better” mindset.
While designing your survey, load in as many specific questions as possible. Factor analysis will fall flat if your survey only has a few broad questions.
The ultimate goal of factor analysis is to take a broad concept and simplify it by considering more granular, contextual information, so this approach will provide you the results you’re looking for.
If you’re looking to perform a factor analysis, you’ll want to avoid having open-ended survey questions.
By providing answer options in the form of scales (whether they be Likert scales, numerical scales, or even ‘yes/no’ scales) you’ll save yourself a world of trouble when you begin conducting your factor analysis. Just make sure that you’re using the same scaled answer options as often as possible.
Factor analysis is a statistical technique used to reduce a large number of variables to a smaller number of factors. In this topic, we will talk about the method and its various aspects.
The technique extracts the maximum common variance from all the variables and puts it into a common score.
It is part of the General Linear Model (GLM), and it rests on several assumptions: no multicollinearity, linear relationships, true correlation between variables and factors, and the inclusion of the relevant variables in the analysis.
There are several methods we can use to extract factors from a data set:
Principal component analysis. It is the most common method, and the one researchers use most. It extracts the maximum variance and puts it into the first factor, then removes the variance explained by that factor and extracts the second, continuing until the last factor.
Common factor analysis. The second most favoured technique, it extracts the common variance and puts it into factors. It excludes the unique variance of the variables and is used in SEM.
Image factoring. It is based on the correlation matrix and uses OLS regression to predict the factors.
Maximum likelihood method. It also works on the correlation matrix, but uses maximum likelihood estimation to derive the factors.
Other methods include alpha factoring and weighted least squares, a further regression-based approach to factoring.
Factor loading: the correlation coefficient between a variable and a factor. It shows how much of the variable’s variance is explained by that factor.
Eigenvalues: also called characteristic roots. An eigenvalue gives the variance explained by a particular factor out of the total variance, while the communality column shows how much of each variable’s variance the retained factors explain.
Factor score: also called the component score. It is a score for every row on every factor, which can be used as an index of all variables and carried into further analysis; it can also be standardized.
Rotation method: rotation makes the output easier to interpret. It redistributes the explained variance across the factors without changing the communalities. There are five common options: (1) no rotation, (2) Varimax, (3) Quartimax, (4) Direct Oblimin, and (5) Promax.
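The factor-score idea can be sketched end to end on simulated data (the data and the choice of two factors are illustrative): standardize the items, extract loadings from the correlation matrix, then compute regression-method scores, one standardized index per respondent per factor.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated survey: 200 respondents, 4 items (purely illustrative data).
X = rng.normal(size=(200, 4))
X[:, 1] += 0.8 * X[:, 0]   # items 1-2 correlated (one underlying trait)
X[:, 3] += 0.8 * X[:, 2]   # items 3-4 correlated (a second trait)

# Standardize items, form the correlation matrix, and take two factors'
# worth of loadings (eigenvectors scaled by sqrt of their eigenvalues).
Z = (X - X.mean(axis=0)) / X.std(axis=0)
R = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1][:2]
loadings = eigvecs[:, order] * np.sqrt(eigvals[order])

# Regression-method factor scores: one score per respondent per factor,
# usable as an index variable in later analyses.
scores = Z @ np.linalg.solve(R, loadings)
```

The resulting score columns can stand in for the original four items in downstream models, which is the data-reduction payoff described above.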
Factor analysis has several assumptions. These include the absence of outliers, an adequate sample size, no perfect multicollinearity, and approximately linear relationships among the variables.
It includes the following key types:
Exploratory factor analysis: it assumes that any variable or indicator can be associated with any factor. It is the most common method used by researchers and is not based on any prior theory.
Confirmatory factor analysis: it is used to determine the factor loadings and factors of measured variables, and to confirm what is expected on the basis of a pre-established theory. It uses two approaches: the traditional method and the SEM approach.
Question. How many types of Factor analysis are there?
A. 5 B. 6 C. 4 D. 3
Answer: The correct answer is option A.
Chen, B., He, N., & Li, X. (2024). Bayesian statistical inference for factor analysis models with clustered data. Mathematics, 12(13), 1949. https://doi.org/10.3390/math12131949
Updated: June 19, 2024
Published: June 15, 2024
When embarking on a research project, selecting the right methodology can be the difference between success and failure. With various methods available, each suited to different types of research, it’s essential you make an informed choice. This blog post will provide tips on how to choose a research methodology that best fits your research goals.
We’ll start with definitions: Research is the systematic process of exploring, investigating, and discovering new information or validating existing knowledge. It involves defining questions, collecting data, analyzing results, and drawing conclusions.
Meanwhile, a research methodology is a structured plan that outlines how your research is to be conducted. A complete methodology should detail the strategies, processes, and techniques you plan to use for your data collection and analysis.
The first step of a research methodology is to identify a focused research topic, which is the question you seek to answer. By setting clear boundaries on the scope of your research, you can concentrate on specific aspects of a problem without being overwhelmed by information. This will produce more accurate findings.
Along with clarifying your research topic, your methodology should also address your research methods. Let’s look at the four main types of research: descriptive, correlational, experimental, and diagnostic.
Descriptive research is an approach designed to describe the characteristics of a population systematically and accurately. This method focuses on answering “what” questions by providing detailed observations about the subject. Descriptive research employs surveys, observational studies, and case studies to gather qualitative or quantitative data.
A real-world example of descriptive research is a survey investigating consumer behavior toward a competitor’s product. By analyzing the survey results, the company can gather detailed insights into how consumers perceive a competitor’s product, which can inform their marketing strategies and product development.
Correlational research examines the statistical relationship between two or more variables to determine whether a relationship exists. Correlational research is particularly useful when ethical or practical constraints prevent experimental manipulation. It is often employed in fields such as psychology, education, and health sciences to provide insights into complex real-world interactions, helping to develop theories and inform further experimental research.
An example of correlational research is the study of the relationship between smoking and lung cancer. Researchers observe and collect data on individuals’ smoking habits and the incidence of lung cancer to determine if there is a correlation between the two variables. This type of research helps identify patterns and relationships, indicating whether increased smoking is associated with higher rates of lung cancer.
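The correlation at the heart of such a study is a one-line computation. The numbers below are invented for illustration, not real epidemiological data; a Pearson's r near +1 would indicate a strong positive association, though, as with all correlational research, not causation.

```python
import numpy as np

# Invented paired observations for ten individuals: cigarettes per day
# and a lung-health risk score (illustrative numbers, not real data).
smoking = np.array([0, 0, 2, 5, 8, 10, 15, 20, 25, 30])
risk = np.array([1.0, 1.2, 1.5, 2.0, 2.4, 3.1, 3.8, 4.6, 5.5, 6.0])

# Pearson's r measures the strength and direction of the linear
# relationship; correlation alone does not establish causation.
r = np.corrcoef(smoking, risk)[0, 1]
```

A strong r here would justify, but not replace, the experimental follow-up work the next section describes.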
Experimental research is a scientific approach where researchers manipulate one or more independent variables to observe their effect on a dependent variable. This method is designed to establish cause-and-effect relationships. Fields like psychology, medicine, and social sciences frequently employ experimental research to test hypotheses and theories under controlled conditions.
A real-world example of experimental research is Pavlov’s Dog experiment. In this experiment, Ivan Pavlov demonstrated classical conditioning by ringing a bell each time he fed his dogs. After repeating this process multiple times, the dogs began to salivate just by hearing the bell, even when no food was presented. This experiment helped to illustrate how certain stimuli can elicit specific responses through associative learning.
Diagnostic research tries to accurately diagnose a problem by identifying its underlying causes. This type of research is crucial for understanding complex situations where a precise diagnosis is necessary for formulating effective solutions. It involves methods such as case studies and data analysis and often integrates both qualitative and quantitative data to provide a comprehensive view of the issue at hand.
An example of diagnostic research is studying the causes of a specific illness outbreak. During an outbreak of a respiratory virus, researchers might conduct diagnostic research to determine the factors contributing to the spread of the virus. This could involve analyzing patient data, testing environmental samples, and evaluating potential sources of infection. The goal is to identify the root causes and contributing factors to develop effective containment and prevention strategies.
Using an established research method is imperative, no matter if you are researching for marketing, technology, healthcare, engineering, or social science. A methodology lends legitimacy to your research by ensuring your data is both consistent and credible. A well-defined methodology also enhances the reliability and validity of the research findings, which is crucial for drawing accurate and meaningful conclusions.
Additionally, methodologies help researchers stay focused and on track, limiting the scope of the study to relevant questions and objectives. This not only improves the quality of the research but also ensures that the study can be replicated and verified by other researchers, further solidifying its scientific value.
Choosing the best research methodology for your project involves several key steps to ensure that your approach aligns with your research goals and questions. Here’s a simplified guide to help you make the best choice.
Clearly define the objectives of your research. What do you aim to discover, prove, or understand? Understanding your goals helps in selecting a methodology that aligns with your research purpose.
Determine whether your research will involve numerical data, textual data, or both. Quantitative methods are best for numerical data, while qualitative methods are suitable for textual or thematic data.
Becoming familiar with the four types of research – descriptive, correlational, experimental, and diagnostic – will enable you to select the most appropriate method for your research. Many times, you will want to use a combination of methods to gather meaningful data.
Consider the resources available to you, including time, budget, and access to data. Some methodologies may require more resources or longer timeframes to implement effectively.
Look at previous research in your field to see which methodologies were successful. This can provide insights and help you choose a proven approach.
By following these steps, you can select a research methodology that best fits your project’s requirements and ensures robust, credible results.
Upon completing your research, the next critical step is to analyze and interpret the data you’ve collected. This involves summarizing the key findings, identifying patterns, and determining how these results address your initial research questions. By thoroughly examining the data, you can draw meaningful conclusions that contribute to the body of knowledge in your field.
It’s essential that you present these findings clearly and concisely, using charts, graphs, and tables to enhance comprehension. Furthermore, discuss the implications of your results, any limitations encountered during the study, and how your findings align with or challenge existing theories.
Your research project should conclude with a strong statement that encapsulates the essence of your research and its broader impact. This final section should leave readers with a clear understanding of the value of your work and inspire continued exploration and discussion in the field.
Now that you know how to perform quality research , it’s time to get started! Applying the right research methodologies can make a significant difference in the accuracy and reliability of your findings. Remember, the key to successful research is not just in collecting data, but in analyzing it thoughtfully and systematically to draw meaningful conclusions. So, dive in, explore, and contribute to the ever-growing body of knowledge with confidence. Happy researching!
At UoPeople, our blog writers are thinkers, researchers, and experts dedicated to curating articles relevant to our mission: making higher education accessible to everyone.
Updated: Mar 8, 2023, 12:37pm
In this article: key takeaways; communication tools used in the workplace in 2023; how Covid-19 continues to affect work communication; the majority of workers use digital communication tools for up to 20 hours a week; digital communication tools are affecting work-life balance; how ineffective communication affects the work environment; digital communication tools are increasing stress in the workplace; most workers prefer email to other digital communication options; how workers are using digital communication to connect; how many people still work from home in each state; methodology.
With work from home increasing to 58% of the workforce (92 million workers), digital communication has become a focal point of workplace communication and productivity. Following an analysis, Forbes Advisor found that Colorado and Maryland had the highest number of remote workers. The survey also found that 28% of all respondents report using a voice-over-internet-protocol (VoIP) phone system. While half of the respondents we surveyed worked in a hybrid environment, 27% worked remotely and 20% on-site.
The days of the phone call may not be behind us, despite how many other communication platforms there are today. Workers are finding that the more effective communication platforms range in the type of communication they provide, whether that be instant messaging, video calls or VoIP systems. Google Meet and Zoom ranked highest for video calls, being used by 40% and 46% of respondents, respectively.
Remote and hybrid workers are using VoIP systems to communicate more often than in-office workers. VoIP systems were used by over a quarter of total respondents, with 37% of remote workers using them, 23% of on-site workers and 24% of hybrid workers.
The most effective communication tool varied between on-site, remote and hybrid workers. For on-site workers, the mobile phone was the most effective method of communication for 38% of respondents, followed by landline (22%) and Zoom (21%). For people working remotely, Zoom was the most effective method for 22% of respondents, as well as Google Chat (also 22%). Hybrid workers followed a similar trend: 31% ranked Zoom as the most effective and 23% ranked Google Meet as the most effective.
Most people turn to tools beyond the standard phone to communicate at work, with 14% of respondents using VoIP when they didn’t prior to the pandemic. Over 20% of them are remote workers. It may seem obvious that more people began using Zoom (24% of respondents), but mobile phones also saw a spike in use by 20% after March 1, 2020.
While Covid-19 changed the way offices and teams communicate, it didn’t necessarily lead to workers feeling less connected across the board. A total of 45% of workers who took the survey actually felt more connected to their team after Covid-19 (43% of on-site, 52% remote and 46% hybrid workers).
Some workers did feel less connected (25%). Remote workers were the most likely to report feeling less connected (34%) while the numbers were lower for on-site workers (27%) and hybrid workers (20%). There were also those who experienced no change. Of these respondents, on-site workers were the most likely to report no change (28%).
Many workers spend all day in front of a screen. The highest percentage of respondents (16%) said they spend 21 to 25 hours per week on digital communication platforms. That’s around five hours per day on average.
Fifteen percent spent 16 to 20 hours, 14% spent 11 to 15 hours and 12% spent six to 10 hours. There was a sharp decrease when the numbers reached 31 to 35 hours: only 5% said they spent this much time on digital communication tools. Digital communication tools took up the use of more than a 40-hour workweek for 2% of respondents.
With so many digital communication tools available, more workers are feeling pressure to stay connected to their coworkers outside of normal working hours. Nearly 25% of workers said that they always feel pressured to stay connected to their peers, while 35% said they often feel pressure. On the other end—those who felt free from pressure—the numbers were much smaller. Seven percent said they rarely felt pressure while 10% said they never do.
Whether working from home, on-site or both, digital communication has a high chance of increasing feelings of burnout. Our survey showed that 60% of respondents said that digital communication increased feelings of burnout. Nearly 70% of remote workers said they experienced burnout from digital communication. Hybrid and on-site workers were less likely to experience burnout as a result of digital communication: 56% and 49% respectively.
Only 11% of workers report that ineffective communication does not have any effect on them. For the rest of the respondents, poor communication greatly affected workers in many areas. Most notably, it impacted productivity for 49% of respondents. Nearly 50% of respondents reported that ineffective communication impacted job satisfaction while 42% said it affected stress levels.
For over 40% of workers, poor communication reduces trust both in leadership and in their team. Remote workers were more affected, with 54% reporting poor communication impacts trust in leadership and 52% reporting it impacts trust in the team. For on-site workers, poor communication did not impact trust to the same extent, though it still had a big impact: 43% reported trust in leadership was impacted and 38% said trust in their team was affected.
Respondents reported that effective communication impacted several areas of work. Forty-two percent said it impacted cross-functional collaboration. Job satisfaction is another big area that is affected by communication: 48% said they were impacted. Nearly half of the respondents said their productivity was impacted.
For 46% of respondents, seeing messages ignored for long periods of time led to stress in the workplace. The notification that their manager is typing a message caused stress for 45% of respondents. Many other aspects of digital communication led to stress as well: crafting digital responses with the right tone of voice (42%), deciphering the tone behind digital messages (38%), last-minute video calls from leadership (36%) and turning off your camera when on video calls (35%).
When it comes to preferred methods of communication, many workers prefer old-fashioned tools. Email is the most popular tool, with 18% of total respondents marking it as their preference (25% of remote workers and 10% of on-site workers). Video calls were the next most popular choice (17%), followed by direct messages (16%). For on-site workers, in-person conversations were by far the most preferred method of communication, with 34% of respondents saying it’s their preference.
For many workers, digital communication is an essential part of their day, but they differ in the methods of communication they use. More than half (56%) of respondents use video for their communication and 55% use audio. Personalized greetings are less common (44%). Emojis and GIFs are still relatively common forms of communication: 42% and 34% respectively.
Forbes Advisor also determined the total number of people working from home in each state in 2023, finding that the percentage of remote workers varied by state. Between 20% and 24.2% of people work from home in the 11 states with the largest work-from-home workforce.
While much has changed in the world of digital communication since Covid-19, there have also been constants. Email and phone are still two of the most preferred methods of communication, despite the numerous options and tools available. VoIP systems are increasing in popularity as well, with 28% of all respondents using them. Workers are spending an average of 20 hours per week on digital communication platforms—that’s half the 40-hour workweek.
Looking ahead, it will be important for teams and small businesses to establish productive systems of digital communication, especially given that over half of the people we surveyed reported that digital communication leads to increased burnout.
If a company or team establishes a healthy culture around digital communication, it can potentially lead to better job satisfaction, increased productivity and higher trust in a company’s leadership and the team.
Forbes Advisor commissioned a survey of 1,000 employed Americans who work in an office setting by market research company OnePoll, in accordance with the Market Research Society’s code of conduct. The margin of error is +/- 3.1 points with 95% confidence. The OnePoll research team is a member of the MRS and has corporate membership with the American Association for Public Opinion Research (AAPOR).
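The reported margin of error is consistent with the textbook formula for a simple random sample at the most conservative proportion, p = 0.5. A quick sanity check of the arithmetic (a sketch of the standard formula only, not OnePoll's actual weighting methodology):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error for a proportion at ~95% confidence.
    p = 0.5 gives the worst case (widest interval)."""
    return z * math.sqrt(p * (1 - p) / n)

# Survey of 1,000 employed Americans
moe = margin_of_error(1000)
print(f"+/- {moe * 100:.1f} points")  # matches the reported +/- 3.1
```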
To find the number of workers in each state who work from home, Forbes Advisor sourced data from the Census Bureau’s American Community Survey.
Leeron is a New York-based writer with experience covering technology and politics. Her work has appeared in publications such as Quartz, the Village Voice, Gothamist, and Slate.
The Journal of Headache and Pain, volume 25, Article number: 100 (2024)
Currently, the treatment and prevention of migraine remain highly challenging. Mendelian randomization (MR) has been widely used to explore novel therapeutic targets. Therefore, we performed a systematic druggable genome-wide MR to explore the potential therapeutic targets for migraine.
We obtained data on druggable genes and screened for genes within brain expression quantitative trait locis (eQTLs) and blood eQTLs, which were then subjected to two-sample MR analysis and colocalization analysis with migraine genome-wide association studies data to identify genes highly associated with migraine. In addition, phenome-wide research, enrichment analysis, protein network construction, drug prediction, and molecular docking were performed to provide valuable guidance for the development of more effective and targeted therapeutic drugs.
We identified 21 druggable genes significantly associated with migraine (BRPF3, CBFB, CDK4, CHD4, DDIT4, EP300, EPHA5, FGFRL1, FXN, HMGCR, HVCN1, KCNK5, MRGPRE, NLGN2, NR1D1, PLXNB1, TGFB1, TGFB3, THRA, TLN1 and TP53), two of which were significant in both blood and brain (HMGCR and TGFB3). The results of phenome-wide research showed that HMGCR was highly correlated with low-density lipoprotein, and TGFB3 was primarily associated with insulin-like growth factor 1 levels.
This study utilized MR and colocalization analysis to identify 21 potential drug targets for migraine, two of which were significant in both blood and brain. These findings provide promising leads for more effective migraine treatments, potentially reducing drug development costs.
Migraine is a prevalent chronic disease characterized by recurring headaches that are typically unilateral and throbbing, ranging from moderate to severe intensity, and often accompanied by nausea, vomiting, sensitivity to light, among other symptoms [ 1 ]. Migraine is recognized as the second most disabling condition globally, creating substantial challenges for those affected and also placing a considerable strain on society overall [ 2 ]. Genetic factors play a substantial role in migraine, with its heritability estimated to be as high as 57% [ 3 ].
Currently, the treatment and prevention of migraine remain highly challenging. Although new drugs (e.g. targeting the calcitonin gene-related peptide, namely CGRP) have been developed, offering significant benefits to migraine sufferers, there are still many issues, such as side effects and less than ideal response rates [ 4 ]. Therefore, it is necessary to continue exploring potential therapeutic targets for migraine treatment. Integrating genetics into drug development may provide a novel approach. While genome-wide association studies (GWAS) are very effective in identifying single nucleotide polymorphisms (SNPs) associated with the risk of migraine [ 5 ], the GWAS method does not clearly and directly identify the causative genes or drive drug development without substantial downstream analyses [ 6 , 7 ].
Mendelian randomization (MR) is a method that utilizes genetic variation as instrumental variables (IVs) to uncover a causal connection between an exposure and an outcome [ 8 ]. MR analysis has been widely applied to discover new therapeutic targets by integrating summarized data from disease GWAS and expression quantitative trait loci (eQTL) studies [ 9 ]. eQTLs in the genomic regions of druggable genes are commonly used as proxies, since the expression level of a gene can be seen as a form of lifelong exposure. Therefore, we performed a systematic druggable genome-wide MR to explore potential therapeutic targets for migraine. First, we obtained data on druggable genes and screened for genes within brain eQTLs and blood eQTLs, which were then subjected to two-sample MR analysis with migraine GWAS data to identify genes highly associated with migraine. Subsequently, we conducted colocalization analysis to ensure the robustness of our results. For genes significant in both blood and brain, phenome-wide research was conducted to explore the relationship between shared potential therapeutic targets and other characteristics. In addition, enrichment analysis, protein network construction, drug prediction, and molecular docking were performed for all significant genes to provide valuable guidance for the development of more effective and targeted therapeutic drugs.
The overview of this study is presented in Fig. 1 .
Overview of this study design. DGIdb: Drug-Gene Interaction Database; eQTL: expression quantitative trait loci; GWAS: genome-wide association studies; PheWAS: Phenome-wide association study; PPI: protein–protein interaction; DSigDB: Drug Signatures Database
Druggable genes were sourced from the Drug-Gene Interaction Database (DGIdb, https://www.dgidb.org/ ) [ 10 ] and a comprehensive review [ 11 ]. The DGIdb offers insights into drug-gene interactions and the potential for druggability. We accessed the 'Categories Data' from DGIdb, which was updated in February 2022. Additionally, we utilized a list of druggable genes provided in a review authored by Finan et al. [ 11 ]. Consolidating druggable genes from the two sources yields a broader range of druggable genes, an approach that has already been applied in a previous study [ 12 ].
The blood eQTL dataset was sourced from eQTLGen ( https://eqtlgen.org/ ) [ 13 ], which provided cis-eQTLs for 16,987 genes derived from 31,684 blood samples collected from healthy individuals of European ancestry (Table 1 ). We acquired cis-eQTL results that were fully significant (with a false discovery rate (FDR) less than 0.05) along with information on allele frequencies. We obtained the brain eQTL data from the PsychENCODE consortia ( http://resource.psychencode.org ) [ 14 ], encompassing 1,387 samples from the prefrontal cortex, primarily of European descent (Table 1 ). We downloaded all significant eQTLs (with FDR less than 0.05) for genes that exhibited an expression level greater than 0.1 fragments per kilobase per million mapped fragments in at least 10 samples, along with complete SNP information.
In this study, the summary statistics data for migraine were obtained from a meta-analysis of GWAS conducted by the International Headache Genetics Consortium (IHGC) in 2022 [ 5 ]. To address privacy concerns related to participants in the 23andMe cohort, the GWAS summary statistics data used in this study did not include samples from the 23andMe cohort. The summary data comprised 589,356 individuals of European ancestry, with 48,975 cases and 540,381 controls (Table 1 ).
MR analyses were conducted using the 'TwoSampleMR' package (version 0.5.7) [ 15 ] in R. We chose the eQTLs of the druggable genome as the exposure data. For constructing IVs, SNPs with an FDR below 0.05 and located within ± 100 kb of the transcriptional start site (TSS) of each gene were selected. These SNPs were subsequently clumped at an r² less than 0.001 using European samples from the 1000 Genomes Project [ 16 ]. The R package 'phenoscanner' [ 17 ] (version 1.0) was employed to identify phenotypes related to the IVs. Additionally, we excluded SNPs that were directly associated with migraine or with a trait directly linked to migraine, namely headache. We harmonised and conducted MR analyses on the filtered SNPs. When only one SNP was available for analysis, we used the Wald ratio method to perform MR estimation. When multiple SNPs were available, MR analysis was performed using the inverse-variance weighted (IVW) method with random effects [ 18 ]. We used Cochran's Q test to assess heterogeneity among the individual causal effects of the SNPs [ 19 ]. Additionally, MR Egger's intercept was utilized to evaluate SNP pleiotropy [ 20 ]. P-values were adjusted by FDR, with 0.05 as the significance threshold. Finally, we selected target genes associated with commonly used medications for migraine and compared their MR results with those of the significant druggable genes.
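The paper runs these estimators through the 'TwoSampleMR' R package; purely to illustrate the mechanics, both estimators reduce to a few lines. The sketch below (Python, with made-up effect sizes, not the package's implementation) shows the Wald ratio, the IVW estimate, and Cochran's Q:

```python
import math

def wald_ratio(beta_outcome, beta_exposure):
    """Single-SNP MR estimate: SNP-outcome effect scaled by the SNP-exposure effect."""
    return beta_outcome / beta_exposure

def ivw(betas, ses):
    """Inverse-variance weighted meta-analysis of per-SNP Wald ratio estimates,
    with a multiplicative random-effects adjustment when Cochran's Q
    signals heterogeneity."""
    weights = [1.0 / se**2 for se in ses]
    est = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - est)**2 for w, b in zip(weights, betas))  # Cochran's Q
    se = math.sqrt(1.0 / sum(weights))
    k = len(betas)
    if k > 1:
        se *= max(1.0, math.sqrt(q / (k - 1)))  # inflate SE under heterogeneity
    return est, se, q

# Hypothetical per-SNP Wald ratios and their standard errors
est, se, q = ivw([0.30, 0.25, 0.35], [0.10, 0.08, 0.12])
```

With only a handful of instruments per gene (as in this study), the Wald ratio and IVW cases both arise routinely.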
Sometimes, a single SNP is located in the regions of two or more genes. In such cases, its impact on a disease (here, migraine) is influenced by a mix of different genes. Colocalization analysis was used to confirm potential shared causal genetic variants, in physical location, between migraine and eQTLs. We separately filtered SNPs located within ± 100 kb of each migraine risk gene's TSS from migraine GWAS data, blood eQTL data, and brain eQTL data. The prior probability that a given SNP is associated with migraine is denoted P1, the prior probability that a given SNP is a significant eQTL is denoted P2, and the prior probability that a given SNP is both associated with migraine and an eQTL is denoted P12. All priors were set to default values (P1 = 1 × 10⁻⁴, P2 = 1 × 10⁻⁴, and P12 = 1 × 10⁻⁵) [ 21 ]. We used posterior probabilities (PP) to quantify the support for the five hypotheses, denoted PPH0 through PPH4: PPH0, not associated with either trait; PPH1, related to gene expression but not associated with migraine risk; PPH2, associated with migraine risk but not related to gene expression; PPH3, associated with both migraine risk and gene expression, but with distinct causal variants; and PPH4, associated with both migraine risk and gene expression, with a shared causal variant. Given the limited power of colocalization analysis, we restricted subsequent analyses to genes with PPH4 greater than or equal to 0.75. Colocalization analysis was conducted using the R package 'coloc' (version 5.2.3).
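The five hypotheses correspond to the enumeration in the colocalization framework of Giambartolomei et al. [ 21 ]. As a toy illustration only (invented per-SNP Bayes factors; the actual 'coloc' package works with approximate Bayes factors in log space), the posteriors combine per-SNP evidence with the P1/P2/P12 priors like this:

```python
def coloc_posteriors(bf1, bf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities PPH0..PPH4 from per-SNP Bayes factors for
    trait 1 (e.g. migraine) and trait 2 (e.g. gene expression)."""
    s1 = sum(bf1)                                # evidence trait 1 has a causal SNP here
    s2 = sum(bf2)                                # evidence trait 2 has a causal SNP here
    s12 = sum(a * b for a, b in zip(bf1, bf2))   # evidence the SAME SNP drives both
    l0 = 1.0
    l1 = p1 * s1
    l2 = p2 * s2
    l3 = p1 * p2 * (s1 * s2 - s12)               # both causal, distinct variants (H3)
    l4 = p12 * s12                               # both causal, shared variant (H4)
    total = l0 + l1 + l2 + l3 + l4
    return [l / total for l in (l0, l1, l2, l3, l4)]

# Toy region of 3 SNPs where SNP 0 shows strong signal for both traits
pp = coloc_posteriors([5000.0, 1.0, 1.0], [4000.0, 1.0, 1.0])
```

With one SNP dominating both traits, PPH4 exceeds the 0.75 threshold used in the study; spreading the signal over different SNPs would instead push mass to PPH3.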
We used the IEU OpenGWAS Project ( https://gwas.mrcieu.ac.uk/phewas/ ) [ 15 ] to obtain the phenome-wide association study (PheWAS) data of SNPs corresponding to druggable genes that were significant in both blood and brain following colocalization analysis.
To explore the functional characteristics and biological relevance of the identified prospective druggable genes, the R package 'clusterProfiler' (version 4.10.1) [ 22 ] was used for Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses. GO includes three terms: Biological Process (BP), Molecular Function (MF), and Cellular Component (CC). KEGG pathways provide information about metabolic pathways.
Protein–protein interaction (PPI) networks can visually display the interactions among the proteins encoded by significant druggable genes. We constructed PPI networks using the STRING database ( https://string-db.org/ ) with a confidence score threshold of 0.4 as the minimum required interaction score, while all other parameters were maintained at their default settings [ 23 ].
Drug Signatures Database (DSigDB, http://dsigdb.tanlab.org/DSigDBv1.0/ ) [ 24 ] is a sizable database with 22,527 gene sets and 17,389 unique compounds spanning 19,531 genes. We uploaded previously identified significant druggable genes to DSigDB to predict candidate drugs and evaluate the pharmacological activity of target genes.
We conducted molecular docking to assess the binding energies and interaction patterns between candidate drugs and their targets. By identifying ligands that exhibit high binding affinity and beneficial interaction patterns, we are able to prioritize drug targets for additional experimental validation and refine the design of prospective candidate drugs. Drug structural data were sourced from the PubChem Compound Database ( https://pubchem.ncbi.nlm.nih.gov/ ) [ 25 ] and downloaded in SDF format, then converted to pdb format using OpenBabel 2.4.1. Protein structural data were downloaded from the Protein Data Bank (PDB, http://www.rcsb.org/ ). The top five important drugs and the proteins encoded by the respective target genes were subjected to molecular docking using the computerized protein–ligand docking software AutoDock 4.2.6 ( http://autodock.scripps.edu/ ) [ 26 ], and the results were visualized using PyMol 3.0.2 ( https://www.pymol.org/ ). The final structures of six proteins and four drugs were obtained.
We obtained 3,953 druggable genes from the DGIdb (Table S1). Additionally, we acquired 4,463 druggable genes from previous reviews (Table S2) [ 11 ]. After integrating the data, we obtained 5,883 unique druggable genes named by the Human Genome Organisation Gene Nomenclature Committee for subsequent analysis (Table S3).
After intersecting eQTLs from blood and brain tissue with druggable genes respectively, the blood eQTLs contained 3,460 gene symbols, while the brain eQTLs had 2,624 gene symbols. We performed MR analysis and identified 24 significant genes associated with migraine from blood and 10 from brain tissue (Figs. 2 and 3 ). Among them, two genes, HMGCR and TGFB3, reached significance in both blood (HMGCR OR 1.38 and TGFB3 OR 0.88) and brain tissue (HMGCR OR 2.02 and TGFB3 OR 0.73). Detailed results for the significant IVs and full results of MR are available in Tables S4-S6.
Forest plot of 24 significant genes associated with migraine from blood
Forest plot of 10 significant genes associated with migraine from brain
We selected target genes associated with commonly used medications for migraine as comparisons for our study results [ 27 ]. These include CGRP-related genes (CALCB, CALCRL, RAMP1 and RAMP3), genes related to 5-hydroxytryptamine (5-HT) receptors targeted by ergot alkaloids, triptans, and ditans (HTR1B, HTR1D, HTR1F), γ-aminobutyric acid (GABA) receptor-related genes targeted by topiramate (GABRA1), calcium ion channel-related genes targeted by flunarizine (CACNA1H, CACNA1I, CALM1), and genes related to the β-adrenoceptors targeted by propranolol (ADRB1, ADRB2). Among these genes (Fig. 4 ), CALM1 showed a significant association with migraine in blood eQTL, but it lost significance after FDR correction (OR 0.92, P = 0.039, FDR-P = 0.455). In brain eQTL, CALCB and RAMP3 showed correlation with migraine, and after FDR correction, CALCB maintained significance (CALCB: OR 0.68, P = 0.0001, FDR-P = 0.029; RAMP3: OR 1.16, P = 0.031, FDR-P = 0.425).
Forest plot of 13 genes associated with commonly used medications for migraine from blood and brain
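The FDR correction that recurs throughout these results (e.g. CALM1's raw P = 0.039 rising to FDR-P = 0.455) penalizes each raw p-value by the number of tests relative to its rank. A minimal sketch of that logic, assuming Benjamini-Hochberg-style adjustment (a common default for FDR, not necessarily the authors' exact procedure):

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values.
    Each raw p-value is scaled by m/rank, then made monotone
    from the least significant rank downward."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):        # walk from least to most significant
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = min(running_min, 1.0)
    return adjusted

# A small hypothetical screen: only very small raw p-values survive many tests
adj = bh_adjust([0.0001, 0.039, 0.20, 0.45, 0.80])
```

Even in this five-test toy example, the raw p = 0.039 no longer clears 0.05 after adjustment; with thousands of genes tested, the penalty is correspondingly larger.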
The results indicated that, of the previously identified 24 significant genes from blood, 17 had a PPH4 greater than 0.75. Among the 10 significant genes from brain, 6 had a PPH4 greater than 0.75. HMGCR and TGFB3 showed significant colocalization results in both blood and brain tissues (Table 2 , Table 3 and Table S7).
Due to the presence of the blood–brain barrier, brain tissue is less readily affected by drugs than are blood components and other organs [ 28 ]. Therefore, we used the IEU OpenGWAS Project to obtain the PheWAS results of SNPs corresponding to HMGCR and TGFB3 from blood, rather than from brain tissue. The results showed that HMGCR was highly correlated with low-density lipoprotein (LDL), and TGFB3 was primarily associated with the level of insulin-like growth factor 1 (IGF1). The complete results are available in Tables S8-S9.
GO analysis of the 21 potential targets showed that they are primarily involved in BP terms such as regulation of protein secretion (GO: 0050708), response to hypoxia (GO: 0001666), negative regulation of carbohydrate metabolic processes (GO: 0045912), and the intrinsic apoptotic signaling pathway in response to DNA damage by the p53 class mediator (GO: 0042771). The main MF terms include transcription coregulator binding (GO: 0001221) and chromatin DNA binding (GO: 0031490, Fig. 5 ). KEGG analysis indicated that the target genes were primarily enriched in pathways such as Human T-cell leukemia virus 1 infection (hsa05166) and the cell cycle (hsa04110, Fig. 6 ), pointing to potential therapeutic pathways for migraine.
GO enrichment results for three terms
KEGG enrichment results
We loaded 21 drug target genes into the STRING database to create a PPI network. The results, shown in Fig. 7 , displayed protein interaction pathways consisting of 21 nodes and 22 edges.
PPI network built with STRING
We used DSigDB to predict potentially effective intervention drugs and listed the top 10 potential intervention drugs based on the adjusted P -values (Table 4 ). The results indicated that butyric acid (butyric acid CTD 00007353) and clofibrate (clofibrate CTD 00005684) were the two most significant drugs, connected respectively to TGFB1, TGFB3, EP300, TP53 and TGFB1, CDK4, HMGCR, TP53. Additionally, arsenenous acid (Arsenenous acid CTD 00000922) and dexamethasone (dexamethasone CTD 00005779) were associated with most of the significant druggable genes.
We used AutoDock 4.2.6 to analyze the binding sites and interactions between the top 5 candidate drugs and the proteins encoded by the corresponding genes, generating the binding energy for each interaction. We obtained 14 effective docking results between the proteins and drugs (Table 5 ). Docking amino acid residues and hydrogen bond lengths are shown in Fig. 8 . Among these, the binding between CDK4 and andrographolide exhibited the lowest binding energy (-7.11 kcal/mol), indicating stable binding.
Molecular docking results of available proteins and drugs. a TGFB1 docking butyric acid, b TGFB1 docking clofibrate, c TGFB1 docking Sorafenib, d TGFB1 docking Andrographolide, e TGFB3 docking butyric acid, f EP300 docking butyric acid, g TP53 docking butyric acid, h CDK4 docking clofibrate, i CDK4 docking Sorafenib, j CDK4 docking Andrographolide, k HMGCR docking clofibrate, l TP53 docking clofibrate, m TP53 docking Sorafenib, n TP53 docking Andrographolide
This study integrated existing druggable gene targets with migraine GWAS data through MR and colocalization analysis, identifying 21 druggable genes significantly associated with migraine (BRPF3, CBFB, CDK4, CHD4, DDIT4, EP300, EPHA5, FGFRL1, FXN, HMGCR, HVCN1, KCNK5, MRGPRE, NLGN2, NR1D1, PLXNB1, TGFB1, TGFB3, THRA, TLN1 and TP53). To further illustrate the potential pleiotropy and drug side effects of significant druggable genes, we conducted a phenome-wide research of two SNPs associated with two druggable genes of interest (HMGCR and TGFB3). Additionally, we performed enrichment analysis and constructed PPI network for these 21 significant genes to understand the biological significance and interaction mechanisms of these drug targets. Finally, drug prediction and molecular docking were conducted to further validate the pharmaceutical value of these significant druggable genes.
The association between HMGCR and migraine has been supported by multiple prior studies. One study indicated that migraine has significant shared signals with certain lipoprotein subgroups at the HMGCR locus [ 29 ]. Hong et al. found that HMGCR genotypes associated with higher LDL cholesterol levels are linked to an increased risk of migraine [ 30 ]. Statins inhibit the activity of HMG-CoA reductase, which is encoded by the HMGCR gene, to exert their lipid-lowering effects and have been widely used in the prevention and treatment of coronary heart disease and ischemic stroke. Previous clinical research has shown that simvastatin combined with vitamin D can effectively prevent episodic migraines in adults [ 31 ]. Additionally, HMGCR may also be involved in immune modulation, with studies suggesting that migraine patients experience neuroinflammation due to activation of the trigeminal-vascular system, leading to peripheral and central sensitization of pain and triggering migraine attacks [ 32 , 33 ]. HMGCR inhibitors can suppress the production of inflammatory mediators and cytokines, thus reducing inflammatory responses [ 34 ]. We speculate that the role of HMGCR in regulating inflammation and immunity may have influenced the drug prediction results generated by DSigDB, which is based on Gene Set Enrichment Analysis (GSEA) [ 24 , 35 , 36 ], thereby diluting the role of HMGCR in regulating lipid metabolism. Therefore, statins did not appear in the predicted list of candidate drugs.
TGFB1 and TGFB3 encode different secreted ligands of the transforming growth factor-beta (TGF-β) superfamily of proteins, namely TGF-β1 and TGF-β3. TGF-β is a pleiotropic cytokine closely associated with immunity and inflammation [ 37 ]. Research has indicated that TGF-β3 can inhibit B cell proliferation and antibody production by suppressing the phosphorylation of NF-κB, thus exerting its anti-inflammatory effects [ 38 ]. The activation of the classical NF-κB pathway is a key mechanism that upregulates pro-inflammatory cytokines, promoting central sensitization and leading to the onset of chronic migraine [ 39 ]. A previous clinical study indicated that the serum levels of TGF-β1 are significantly elevated in migraine patients [ 40 ]. Ishizaki et al. found that TGF-β1 levels in the platelet-poor plasma of migraine patients are significantly increased during headache-free intervals [ 41 ]. Bø et al. discovered that during acute migraine attacks, the levels of TGF-β1 in cerebrospinal fluid are significantly higher compared to the control group [ 42 ]. Although some studies consider TGF-β1 to be an anti-inflammatory cytokine [ 43 ], based on previous research and the results of this study, we believe that TGFB1 and its encoded protein, TGF-β1, are associated with an increased risk of migraine. The pleiotropic effects of TGF-β1 on inflammation may depend on concentration and environment [ 44 ]. In addition, we found an association between TGFB3 and IGF1 in our phenome-wide research. A previous MR study showed that increased levels of IGF1 are causally associated with decreased migraine risk [ 45 ]. Recent experimental results suggest that the miR-653-3p/IGF1 axis regulating the AKT/TRPV1 signaling pathway may be a potential pathogenic mechanism for migraine [ 46 ].
The beneficial effects of TGF-β3 and IGF1 on migraine may be associated with the regulation of gene expression in different microenvironments to promote the transition of microglial cells from M1 (pathogenic) to M2 (protective) phenotypes [ 47 ].
Among the 13 genes targeted by some commonly used migraine treatment drugs, the MR results for 3 genes were significant in blood or brain eQTL. Although only one gene remained significant after FDR correction, this still demonstrates that the significant genes newly identified in this study are reliable and have potential as drug targets to some extent. The lack of significance in certain drug target genes may be related to the insufficient sample size of the migraine GWAS data included in our study. It would be meaningful to validate the results of this study with more large-sample GWAS data available in the future.
In this study, DSigDB predicted 10 potential drugs for migraine, but current clinical research focuses mainly on melatonin and dexamethasone. ClinicalTrials ( https://clinicaltrials.gov/ ) has registered multiple studies on the efficacy of melatonin and dexamethasone for migraine, and the findings differ and remain controversial. A published clinical study on acute treatment of pediatric migraine showed that both low and high doses of melatonin contributed to pain relief [ 48 ]. The consensus published by the Brazilian Headache Society in 2022 lists melatonin as a recommended medication for preventing episodic migraine (Class II; Level C) [ 49 ]. However, one study indicated that bedtime administration of sustained-release melatonin did not lead to a reduction in migraine attack frequency compared to placebo [ 50 ]. Dexamethasone has shown good efficacy for severe acute migraine attacks [ 51 ]. The 2016 guidelines for the emergency treatment of acute migraine in adults, issued by the American Headache Society, mention that dexamethasone should be administered to prevent the recurrence of migraine (Should offer—Level B) [ 52 ]. However, another study suggested that dexamethasone does not reduce migraine recurrence [ 53 ].
An animal study has shown that clofibrate can improve oxidative stress and neuroinflammation caused by the exaggerated production of lipid peroxidation products [ 54 ]. Clofibrate can activate peroxisome-proliferator-activated receptor (PPAR) α, inhibit the activation of the NF-κB signaling pathway and the production of interleukin (IL)-6, and thereby exert an anti-inflammatory effect [ 55 , 56 ]. Additionally, a recent animal study indicated upregulation of astrocytic activation and glial fibrillary acidic protein (GFAP) expression in the trigeminal nucleus caudalis (TNC) in a mouse model of migraine induced by recurrent dural infusion of inflammatory soup (IS). This was accompanied by the release of various cytokines, increased neuronal excitability, and promotion of central sensitization processes [ 57 ]. Clofibrate can reduce the activation of astrocytes and the expression of GFAP, thereby inhibiting neuroinflammation [ 54 ]. Andrographolide, a major bioactive constituent of Andrographis paniculata, has broad effects on various inflammatory and neurological disorders [ 58 , 59 , 60 ]. Although we did not find any migraine clinical trials related to clofibrate or andrographolide on PubMed or ClinicalTrials, we believe that the prospects for using clofibrate and andrographolide in the treatment of migraine are quite promising. We hope to see more research on the association of clofibrate and andrographolide with migraine in the future.
Our study has several advantages. First, we provided compelling genetic evidence about migraine drug targets using MR, drawing on the largest publicly available GWAS data to date. Additionally, colocalization analysis helps reduce false negatives and false positives, ensuring the robustness of the results. Enrichment analysis and PPI networks illustrate the functional characteristics and regulatory relationships of these target genes, providing potential avenues for migraine drug development. The drug predictions demonstrate the medicinal potential of these genes, and high binding activity from molecular docking indicates their strong potential as drug targets. Our study thus provides a comprehensive evaluation, from identifying migraine-related druggable genes to characterizing drug binding properties, and proposes migraine drug targets with compelling evidence.
This study also has several notable limitations. First, the number of eQTL IVs in MR is limited, with most genes having no more than three SNPs, which restricts the credibility of the MR results. Additionally, while MR offers valuable insights into causality, it assumes a linear exposure-outcome relationship akin to lifelong, low-dose exposure, which may not fully replicate real-world clinical trials that typically assess high doses of drugs over a short timeframe. MR results may therefore not accurately reflect the effect sizes observed in actual clinical settings, nor fully predict the impacts of drugs. Second, the generalizability of this study is limited by its primary inclusion of individuals of European descent. Extrapolating the findings to other genetic ancestry populations requires further research and validation to ensure broader applicability. Third, the study focuses mainly on cis-eQTLs and their relationship with migraine, potentially overlooking other regulatory and environmental factors that contribute to the complexity of the disease. Fourth, while enrichment analysis is valuable, it has inherent limitations, as it relies on predefined gene sets or pathways that may not encompass all possible biological mechanisms or interactions. A lack of significant enrichment does not necessarily mean there is no biological relevance, and results should be interpreted cautiously. Fifth, the accuracy of molecular docking analysis largely depends on the quality of the protein structures and ligands. While this method identified potential drug targets, it does not guarantee their efficacy in clinical settings. Subsequent experimental validation and clinical trials are necessary to confirm the therapeutic potential of the identified targets. Moreover, we only investigated the side effects of two significant druggable genes.
The effects of drugs on targets are very broad, and many off-target effects cannot be explored through MR, requiring further basic and clinical trials to gain a more comprehensive understanding. Finally, the clinical relevance of our study results needs further validation; the lack of clinical data related to our study is a significant limitation.
This study utilized MR and colocalization analysis to identify 21 potential drug targets for migraine, two of which were significant in both blood and brain. These findings provide promising leads for more effective migraine treatments and may reduce drug development costs, highlighting druggable genes significantly associated with migraine. Further clinical trials of drugs targeting these genes are warranted.
The migraine GWAS dataset provided by Hautakangas et al. can be obtained by contacting the International Headache Genetics Consortium [ 5 ]. Other data can be obtained from the original literature and websites.
Abbreviations
eQTL: Expression quantitative trait loci
GWAS: Genome-wide association studies
CGRP: Calcitonin gene-related peptide
SNPs: Single nucleotide polymorphisms
IVs: Instrumental variables
DGIdb: Drug–Gene Interaction Database
FDR: False discovery rate
IHGC: International Headache Genetics Consortium
TSS: Transcriptional start site
IVW: Inverse-variance weighted
5-HT: 5-Hydroxytryptamine
GABA: γ-Aminobutyric acid
PP: Posterior probabilities
PheWAS: Phenome-wide association study
GO: Gene Ontology
KEGG: Kyoto Encyclopedia of Genes and Genomes
BP: Biological process
MF: Molecular function
CC: Cellular component
PPI: Protein–protein interaction
DSigDB: Drug Signatures Database
PDB: Protein Data Bank
LDL: Low-density lipoprotein
GSEA: Gene Set Enrichment Analysis
IGF-1: Insulin-like growth factor 1
TGF-β: Transforming growth factor-beta
PPARs: Peroxisome-proliferator-activated receptors
IL: Interleukin
GFAP: Glial fibrillary acidic protein
TNC: Trigeminal nucleus caudalis
IS: Inflammatory soup
Headache Classification Committee of the International Headache Society (IHS) (2018) The International Classification of Headache Disorders, 3rd edition. Cephalalgia 38(1):1–211. https://doi.org/10.1177/0333102417738202
GBD Neurology Collaborators (2019) Global, regional, and national burden of neurological disorders, 1990–2016: a systematic analysis for the Global Burden of Disease Study 2016. Lancet Neurol. 18(5):459–480. https://doi.org/10.1016/s1474-4422(18)30499-x
Choquet H, Yin J, Jacobson AS, Horton BH, Hoffmann TJ, Jorgenson E et al (2021) New and sex-specific migraine susceptibility loci identified from a multiethnic genome-wide meta-analysis. Commun Biol 4(1):864. https://doi.org/10.1038/s42003-021-02356-y
Tanaka M, Szabó Á, Körtési T, Szok D, Tajti J, Vécsei L (2023) From CGRP to PACAP, VIP, and beyond: unraveling the next chapters in migraine treatment. Cells 12(22). https://doi.org/10.3390/cells12222649
Hautakangas H, Winsvold BS, Ruotsalainen SE, Bjornsdottir G, Harder AVE, Kogelman LJA et al (2022) Genome-wide analysis of 102,084 migraine cases identifies 123 risk loci and subtype-specific risk alleles. Nat Genet 54(2):152–160. https://doi.org/10.1038/s41588-021-00990-0
Qi T, Song L, Guo Y, Chen C, Yang J (2024) From genetic associations to genes: methods, applications, and challenges. Trends Genet. https://doi.org/10.1016/j.tig.2024.04.008
Namba S, Konuma T, Wu KH, Zhou W, Okada Y (2022) A practical guideline of genomics-driven drug discovery in the era of global biobank meta-analysis. Cell Genom 2(10):100190. https://doi.org/10.1016/j.xgen.2022.100190
Burgess S, Timpson NJ, Ebrahim S, Davey Smith G (2015) Mendelian randomization: where are we now and where are we going? Int J Epidemiol 44(2):379–388. https://doi.org/10.1093/ije/dyv108
Storm CS, Kia DA, Almramhi MM, Bandres-Ciga S, Finan C, Hingorani AD et al (2021) Finding genetically-supported drug targets for Parkinson’s disease using Mendelian randomization of the druggable genome. Nat Commun 12(1):7342. https://doi.org/10.1038/s41467-021-26280-1
Freshour SL, Kiwala S, Cotto KC, Coffman AC, McMichael JF, Song JJ et al (2021) Integration of the Drug-Gene Interaction Database (DGIdb 4.0) with open crowdsource efforts. Nucleic Acids Res 49(D1):D1144–d1151. https://doi.org/10.1093/nar/gkaa1084
Finan C, Gaulton A, Kruger FA, Lumbers RT, Shah T, Engmann J et al (2017) The druggable genome and support for target identification and validation in drug development. Sci Transl Med 9(383). https://doi.org/10.1126/scitranslmed.aag1166
Su WM, Gu XJ, Dou M, Duan QQ, Jiang Z, Yin KF et al (2023) Systematic druggable genome-wide Mendelian randomisation identifies therapeutic targets for Alzheimer’s disease. J Neurol Neurosurg Psychiatry 94(11):954–961. https://doi.org/10.1136/jnnp-2023-331142
Võsa U, Claringbould A, Westra HJ, Bonder MJ, Deelen P, Zeng B et al (2021) Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression. Nat Genet 53(9):1300–1310. https://doi.org/10.1038/s41588-021-00913-z
Wang D, Liu S, Warrell J, Won H, Shi X, Navarro FCP et al (2018) Comprehensive functional genomic resource and integrative model for the human brain. Science 362(6420). https://doi.org/10.1126/science.aat8464
Hemani G, Zheng J, Elsworth B, Wade KH, Haberland V, Baird D et al (2018) The MR-Base platform supports systematic causal inference across the human phenome. eLife 7. https://doi.org/10.7554/eLife.34408
1000 Genomes Project Consortium, Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM et al (2015) A global reference for human genetic variation. Nature 526(7571):68–74. https://doi.org/10.1038/nature15393
Staley JR, Blackshaw J, Kamat MA, Ellis S, Surendran P, Sun BB et al (2016) PhenoScanner: a database of human genotype-phenotype associations. Bioinformatics 32(20):3207–3209. https://doi.org/10.1093/bioinformatics/btw373
Burgess S, Davey Smith G, Davies NM, Dudbridge F, Gill D, Glymour MM et al (2019) Guidelines for performing Mendelian randomization investigations: update for summer 2023. Wellcome Open Res 4:186. https://doi.org/10.12688/wellcomeopenres.15555.3
Greco MF, Minelli C, Sheehan NA, Thompson JR (2015) Detecting pleiotropy in Mendelian randomisation studies with summary data and a continuous outcome. Stat Med 34(21):2926–2940. https://doi.org/10.1002/sim.6522
Bowden J, Davey Smith G, Burgess S (2015) Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int J Epidemiol 44(2):512–525. https://doi.org/10.1093/ije/dyv080
Giambartolomei C, Vukcevic D, Schadt EE, Franke L, Hingorani AD, Wallace C et al (2014) Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet 10(5):e1004383. https://doi.org/10.1371/journal.pgen.1004383
Yu G, Wang LG, Han Y, He QY (2012) clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16(5):284–287. https://doi.org/10.1089/omi.2011.0118
Szklarczyk D, Kirsch R, Koutrouli M, Nastou K, Mehryary F, Hachilif R et al (2023) The STRING database in 2023: protein-protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res 51(D1):D638–d646. https://doi.org/10.1093/nar/gkac1000
Yoo M, Shin J, Kim J, Ryall KA, Lee K, Lee S et al (2015) DSigDB: drug signatures database for gene set analysis. Bioinformatics 31(18):3069–3071. https://doi.org/10.1093/bioinformatics/btv313
Kim S, Chen J, Cheng T, Gindulyte A, He J, He S et al (2023) PubChem 2023 update. Nucleic Acids Res 51(D1):D1373–d1380. https://doi.org/10.1093/nar/gkac956
Morris GM, Huey R, Lindstrom W, Sanner MF, Belew RK, Goodsell DS et al (2009) AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. J Comput Chem 30(16):2785–2791. https://doi.org/10.1002/jcc.21256
Zobdeh F, Ben Kraiem A, Attwood MM, Chubarev VN, Tarasov VV, Schiöth HB et al (2021) Pharmacological treatment of migraine: drug classes, mechanisms of action, clinical trials and new treatments. Br J Pharmacol 178(23):4588–4607. https://doi.org/10.1111/bph.15657
Pandit R, Chen L, Götz J (2020) The blood-brain barrier: physiology and strategies for drug delivery. Adv Drug Deliv Rev 165–166:1–14. https://doi.org/10.1016/j.addr.2019.11.009
Guo Y, Daghlas I, Gormley P, Giulianini F, Ridker PM, Mora S et al (2021) Phenotypic and Genotypic Associations Between Migraine and Lipoprotein Subfractions. Neurology 97(22):e2223–e2235. https://doi.org/10.1212/wnl.0000000000012919
Hong P, Han L, Wan Y (2024) Mendelian randomization study of lipid metabolism characteristics and migraine risk. Eur J Pain. https://doi.org/10.1002/ejp.2235
Buettner C, Nir RR, Bertisch SM, Bernstein C, Schain A, Mittleman MA et al (2015) Simvastatin and vitamin D for migraine prevention: A randomized, controlled trial. Ann Neurol 78(6):970–981. https://doi.org/10.1002/ana.24534
Ferrari MD, Klever RR, Terwindt GM, Ayata C, van den Maagdenberg AM (2015) Migraine pathophysiology: lessons from mouse models and human genetics. Lancet Neurol 14(1):65–80. https://doi.org/10.1016/s1474-4422(14)70220-0
Kursun O, Yemisci M, van den Maagdenberg A, Karatas H (2021) Migraine and neuroinflammation: the inflammasome perspective. J Headache Pain 22(1):55. https://doi.org/10.1186/s10194-021-01271-1
Greenwood J, Mason JC (2007) Statins and the vascular endothelial inflammatory response. Trends Immunol 28(2):88–98. https://doi.org/10.1016/j.it.2006.12.003
Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA et al (2005) Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102(43):15545–15550. https://doi.org/10.1073/pnas.0506580102
Mootha VK, Lindgren CM, Eriksson KF, Subramanian A, Sihag S, Lehar J et al (2003) PGC-1alpha-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat Genet 34(3):267–273. https://doi.org/10.1038/ng1180
Sanjabi S, Zenewicz LA, Kamanaka M, Flavell RA (2009) Anti-inflammatory and pro-inflammatory roles of TGF-beta, IL-10, and IL-22 in immunity and autoimmunity. Curr Opin Pharmacol 9(4):447–453. https://doi.org/10.1016/j.coph.2009.04.008
Okamura T, Sumitomo S, Morita K, Iwasaki Y, Inoue M, Nakachi S et al (2015) TGF-β3-expressing CD4+CD25(-)LAG3+ regulatory T cells control humoral immune responses. Nat Commun 6:6329. https://doi.org/10.1038/ncomms7329
Sun S, Fan Z, Liu X, Wang L, Ge Z (2024) Microglia TREM1-mediated neuroinflammation contributes to central sensitization via the NF-κB pathway in a chronic migraine model. J Headache Pain 25(1):3. https://doi.org/10.1186/s10194-023-01707-w
Güzel I, Taşdemir N, Celik Y (2013) Evaluation of serum transforming growth factor β1 and C-reactive protein levels in migraine patients. Neurol Neurochir Pol 47(4):357–362. https://doi.org/10.5114/ninp.2013.36760
Ishizaki K, Takeshima T, Fukuhara Y, Araki H, Nakaso K, Kusumi M et al (2005) Increased plasma transforming growth factor-beta1 in migraine. Headache 45(9):1224–1228. https://doi.org/10.1111/j.1526-4610.2005.00246.x
Bø SH, Davidsen EM, Gulbrandsen P, Dietrichs E, Bovim G, Stovner LJ et al (2009) Cerebrospinal fluid cytokine levels in migraine, tension-type headache and cervicogenic headache. Cephalalgia 29(3):365–372. https://doi.org/10.1111/j.1468-2982.2008.01727.x
Yang L, Zhou Y, Zhang L, Wang Y, Zhang Y, Xiao Z (2023) Aryl hydrocarbon receptors improve migraine-like pain behaviors in rats through the regulation of regulatory T cell/T-helper 17 cell-related homeostasis. Headache 63(8):1045–1060. https://doi.org/10.1111/head.14599
Komai T, Okamura T, Inoue M, Yamamoto K, Fujio K (2018) Reevaluation of pluripotent cytokine TGF-β3 in immunity. Int J Mol Sci 19(8):2261. https://doi.org/10.3390/ijms19082261
Abuduxukuer R, Niu PP, Guo ZN, Xu YM, Yang Y (2022) Circulating insulin-like growth factor 1 levels and migraine risk: a mendelian randomization study. Neurol Ther 11(4):1677–1689. https://doi.org/10.1007/s40120-022-00398-w
Ye S, Wei L, Jiang Y, Yuan Y, Zeng Y, Zhu L et al (2024) Mechanism of NO(2)-induced migraine in rats: The exploration of the role of miR-653-3p/IGF1 axis. J Hazard Mater 465:133362. https://doi.org/10.1016/j.jhazmat.2023.133362
Ji J, Xue TF, Guo XD, Yang J, Guo RB, Wang J et al (2018) Antagonizing peroxisome proliferator-activated receptor γ facilitates M1-to-M2 shift of microglia by enhancing autophagy via the LKB1-AMPK signaling pathway. Aging Cell 17(4):e12774. https://doi.org/10.1111/acel.12774
Gelfand AA, Ross AC, Irwin SL, Greene KA, Qubty WF, Allen IE (2020) Melatonin for Acute Treatment of Migraine in Children and Adolescents: A Pilot Randomized Trial. Headache 60(8):1712–1721. https://doi.org/10.1111/head.13934
Santos PSF, Melhado EM, Kaup AO, Costa A, Roesler CAP, Piovesan ÉJ et al (2022) Consensus of the Brazilian Headache Society (SBCe) for prophylactic treatment of episodic migraine: part II. Arq Neuropsiquiatr 80(9):953–969. https://doi.org/10.1055/s-0042-1755320
Alstadhaug KB, Odeh F, Salvesen R, Bekkelund SI (2010) Prophylaxis of migraine with melatonin: a randomized controlled trial. Neurology 75(17):1527–1532. https://doi.org/10.1212/WNL.0b013e3181f9618c
Gelfand AA, Goadsby PJ (2012) A neurologist’s guide to acute migraine therapy in the emergency room. Neurohospitalist 2(2):51–59. https://doi.org/10.1177/1941874412439583
Orr SL, Friedman BW, Christie S, Minen MT, Bamford C, Kelley NE et al (2016) Management of Adults With Acute Migraine in the Emergency Department: The American Headache Society Evidence Assessment of Parenteral Pharmacotherapies. Headache 56(6):911–940. https://doi.org/10.1111/head.12835
Rowe BH, Colman I, Edmonds ML, Blitz S, Walker A, Wiens S (2008) Randomized controlled trial of intravenous dexamethasone to prevent relapse in acute migraine headache. Headache 48(3):333–340. https://doi.org/10.1111/j.1526-4610.2007.00959.x
Oyagbemi AA, Adebiyi OE, Adigun KO, Ogunpolu BS, Falayi OO, Hassan FO et al (2020) Clofibrate, a PPAR-α agonist, abrogates sodium fluoride-induced neuroinflammation, oxidative stress, and motor incoordination via modulation of GFAP/Iba-1/anti-calbindin signaling pathways. Environ Toxicol 35(2):242–253. https://doi.org/10.1002/tox.22861
Sánchez-Aguilar M, Ibarra-Lara L, Cano-Martínez A, Soria-Castro E, Castrejón-Téllez V, Pavón N et al (2023) PPAR alpha activation by clofibrate alleviates ischemia/reperfusion injury in metabolic syndrome rats by decreasing cardiac inflammation and remodeling and by regulating the atrial natriuretic peptide compensatory response. Int J Mol Sci 24(6). https://doi.org/10.3390/ijms24065321
Brown JD, Plutzky J (2007) Peroxisome proliferator-activated receptors as transcriptional nodal points and therapeutic targets. Circulation 115(4):518–533. https://doi.org/10.1161/circulationaha.104.475673
Zhang L, Lu C, Kang L, Li Y, Tang W, Zhao D et al (2022) Temporal characteristics of astrocytic activation in the TNC in a mice model of pain induced by recurrent dural infusion of inflammatory soup. J Headache Pain 23(1):8. https://doi.org/10.1186/s10194-021-01382-9
Patel R, Kaur K, Singh S (2021) Protective effect of andrographolide against STZ induced Alzheimer’s disease in experimental rats: possible neuromodulation and Aβ((1–42)) analysis. Inflammopharmacology 29(4):1157–1168. https://doi.org/10.1007/s10787-021-00843-6
Ahmed S, Kwatra M, Ranjan Panda S, Murty USN, Naidu VGM (2021) Andrographolide suppresses NLRP3 inflammasome activation in microglia through induction of parkin-mediated mitophagy in in-vitro and in-vivo models of Parkinson disease. Brain Behav Immun 91:142–158. https://doi.org/10.1016/j.bbi.2020.09.017
Ciampi E, Uribe-San-Martin R, Cárcamo C, Cruz JP, Reyes A, Reyes D et al (2020) Efficacy of andrographolide in not active progressive multiple sclerosis: a prospective exploratory double-blind, parallel-group, randomized, placebo-controlled trial. BMC Neurol 20(1):173. https://doi.org/10.1186/s12883-020-01745-w
The authors sincerely thank related investigators for sharing the statistics included in this study.
This study was funded by the National Natural Science Foundation of China (82374575, 82074179), the Beijing Natural Science Foundation (7232270), the Outstanding Young Talents Program of Capital Medical University (B2207), and the Capital's Funds for Health Improvement and Research (CFH2024-2–2235).
Authors and affiliations.
Department of Acupuncture and Moxibustion, Beijing Hospital of Traditional Chinese Medicine, Capital Medical University, Beijing Key Laboratory of Acupuncture Neuromodulation, No. 23, Meishuguan Houjie, Beijing, 100010, China
Chengcheng Zhang & Lu Liu
State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, 100876, China
LL contributed to the study conception and design. CCZ and YWH performed the statistical analysis. CCZ drafted the manuscript. All authors commented on previous versions of the manuscript. All authors contributed to the article and approved the submitted version.
Correspondence to Lu Liu .
Ethics approval and consent to participate.
Not applicable.
All data analyzed during this study have been previously published.
The authors declare no competing interests.
Publisher’s note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary material 1.
Rights and permissions.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Cite this article.
Zhang, C., He, Y. & Liu, L. Identifying therapeutic target genes for migraine by systematic druggable genome-wide Mendelian randomization. J Headache Pain 25 , 100 (2024). https://doi.org/10.1186/s10194-024-01805-3
Received : 05 May 2024
Accepted : 05 June 2024
Published : 12 June 2024
DOI : https://doi.org/10.1186/s10194-024-01805-3
ISSN: 1129-2377
European Journal of Medical Research volume 29, Article number: 327 (2024)
Some previous observational studies have linked deep venous thrombosis (DVT) to thyroid diseases; however, the findings were contradictory. This study aimed to investigate whether some common thyroid diseases can cause DVT using a two-sample Mendelian randomization (MR) approach.
This two-sample MR study used single nucleotide polymorphisms (SNPs) identified by the FinnGen genome-wide association studies (GWAS) as highly associated with some common thyroid diseases, including autoimmune hyperthyroidism (962 cases and 172,976 controls), subacute thyroiditis (418 cases and 187,684 controls), hypothyroidism (26,342 cases and 59,827 controls), and malignant neoplasm of the thyroid gland (989 cases and 217,803 controls). These SNPs were used as instruments. Outcome datasets for the GWAS on DVT (6,767 cases and 330,392 controls) were selected from the UK Biobank data, obtained from the Integrative Epidemiology Unit (IEU) open GWAS project. The inverse variance weighted (IVW), MR-Egger, and weighted median methods were used to estimate the causal association between thyroid diseases and DVT. Cochran's Q test was used to quantify the heterogeneity of the instrumental variables (IVs), and the MR Pleiotropy RESidual Sum and Outlier test (MR-PRESSO) was used to detect horizontal pleiotropy. When a causal relationship was significant, bidirectional MR analysis was performed to determine any reverse causal relationship between exposures and outcomes.
This MR study illustrated that autoimmune hyperthyroidism slightly increased the risk of DVT according to the IVW (odds ratio (OR) = 1.0009; p = 0.024) and weighted median (OR = 1.001; p = 0.028) methods. According to Cochran's Q test, there was no evidence of heterogeneity in the IVs. Additionally, MR-PRESSO did not detect horizontal pleiotropy (p = 0.972). However, no association was observed between the other thyroid diseases and DVT using the IVW, weighted median, and MR-Egger regression methods.
This study revealed that autoimmune hyperthyroidism may cause DVT; however, more evidence and larger sample sizes are required to draw more precise conclusions.
Deep venous thrombosis (DVT) is a common type of disease that occurs in 1–2 individuals per 1000 each year [ 1 ]. In the post-COVID-19 era, DVT showed a higher incidence rate [ 2 ]. Among hospitalized patients, the incidence rate of this disease was as high as 2.7% [ 3 ], increasing the risk of adverse events during hospitalization. According to the Registro Informatizado Enfermedad Tromboembolica (RIETE) registry, which included data from ~ 100,000 patients from 26 countries, the 30-day mortality rate was 2.6% for distal DVT and 3.3% for proximal DVT [ 4 ]. Other studies have shown that the one-year mortality rate of DVT is 19.6% [ 5 ]. DVT and pulmonary embolism (PE), collectively referred to as venous thromboembolism (VTE), constitute a major global burden of disease [ 6 ].
Thyroid diseases are common in the real world. Previous studies have focused on the relationship between DVT and thyroid diseases, including thyroid dysfunction and thyroid cancer. Some case reports [ 7 , 8 , 9 ] have demonstrated that hyperthyroidism is often associated with DVT and indicates a worse prognosis [ 10 ]. The relationship between thyroid tumors and venous thrombosis has troubled researchers for many years. In 1989, the first case of papillary thyroid carcinoma presenting with axillary vein thrombosis as the initial symptom was reported [ 11 ]. In 1995, researchers began to notice the relationship between thyroid tumors and hypercoagulability [ 12 ], laying the foundation for subsequent extensive research. However, the aforementioned observational studies had limitations, such as small sample sizes, selection bias, reverse causality, and confounding factors, which may have led to unreliable conclusions [ 13 ].
Previous studies have explored the relationship between thyroid disease and DVT and revealed that high levels of thyroid hormones may increase the risk of DVT. Hyperthyroidism promotes a procoagulant and hypofibrinolytic state by affecting von Willebrand factor; factors VIII, IX, and X; fibrinogen; and plasminogen activator inhibitor-1 [ 14 , 15 ]. At the molecular level, thyroid hormones are thought to affect coagulation through an important nuclear thyroid hormone receptor (TR), TRβ [ 16 ], and to participate in pathological coagulation through endothelial dysfunction; thyroid hormones may have non-genomic effects on the behavior of endothelial cells [ 17 , 18 ]. In a study of tumor thrombosis, Lou [ 19 ] found by microarray that 303 circular RNAs were differentially expressed in DVT. Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis revealed that the most significantly enriched pathways included the thyroid hormone signaling pathway, endocytosis, and proteoglycans in cancer, indicating that tumor cells and thyroid hormones might interact to promote thrombosis. Based on these studies, we speculated that thyroid diseases, including thyroid dysfunction and thyroid tumors, may cause DVT.
Mendelian randomization (MR) research is a causal inference technique that can be used to assess the causal relationship and reverse causation between specific exposure and outcome factors. If certain assumptions [ 20 ] are fulfilled, genetic variants can be employed as instrumental variables (IVs) to establish causal relationships. Bidirectional MR analysis can clarify the presence of reverse causal relationships [ 21 ], making the conclusions more comprehensive. Accordingly, we aimed to apply a two-sample MR strategy to investigate whether DVT is related to four thyroid diseases, including autoimmune hyperthyroidism, subacute thyroiditis, hypothyroidism, and thyroid cancer.
MR relies on single nucleotide polymorphisms (SNPs) as IVs. The IVs should fulfill the following three criteria [ 22 ]: (1) IVs should be strongly associated with exposure. (2) Genetic variants must be independent of unmeasured confounding factors that may affect the exposure–outcome association. (3) IVs are presumed to affect the outcome only through their associations with exposure (Fig. 1 ). IVs that met the above requirements were used to estimate the relationship between exposure and outcome. Our study protocol conformed to the STROBE-MR Statement [ 23 ], and all methods were performed in accordance with the relevant guidelines and regulations.
The relationship between instrumental variables, exposure, outcome, and confounding factors
Datasets (Table 1 ) in this study were obtained from the publicly available IEU open genome-wide association study (GWAS) project [ 24 ] ( https://gwas.mrcieu.ac.uk ). There was no sample overlap between the data sources for the outcome and the exposures. Because de-identified summary-level data were used, individual-level information such as age and sex was not available. Ethical approval had been obtained for all original studies, and this study complied with the terms of use of the database.
MR analysis was performed using the R package "TwoSampleMR". SNPs associated with each thyroid disease at the genome-wide significance threshold of p < 5.0 × 10⁻⁸ were selected as potential IVs. To ensure independence between the genetic variants used as IVs, the linkage disequilibrium (LD) clumping threshold was set to r² < 0.001 within a window of 10,000 kb. The SNP with the lowest p-value at each locus was retained for analysis.
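The study itself used the R package TwoSampleMR. Purely as an illustration of the selection logic described above (genome-wide significance filtering, then keeping the lowest-p SNP per independent locus), a minimal Python sketch with made-up SNPs might look like the following; the hypothetical locus labels stand in for proper LD-based clumping (r² < 0.001, 10,000 kb window) against a reference panel:

```python
# Illustrative sketch (not the authors' code): select instrumental variables
# by genome-wide significance, then keep the lowest-p SNP per locus.
GWAS_P_THRESHOLD = 5e-8  # genome-wide significance

def select_ivs(snps):
    """snps: list of dicts with 'rsid', 'p', and 'locus' keys."""
    significant = [s for s in snps if s["p"] < GWAS_P_THRESHOLD]
    best_per_locus = {}
    for s in significant:
        cur = best_per_locus.get(s["locus"])
        if cur is None or s["p"] < cur["p"]:
            best_per_locus[s["locus"]] = s
    return sorted(best_per_locus.values(), key=lambda s: s["rsid"])

# Hypothetical candidate SNPs:
candidates = [
    {"rsid": "rs0001", "p": 1e-9,  "locus": "1q21"},
    {"rsid": "rs0002", "p": 4e-8,  "locus": "1q21"},  # same locus, larger p -> dropped
    {"rsid": "rs0003", "p": 2e-7,  "locus": "2p16"},  # not genome-wide significant
    {"rsid": "rs0004", "p": 3e-10, "locus": "5q31"},
]
ivs = select_ivs(candidates)
print([s["rsid"] for s in ivs])  # -> ['rs0001', 'rs0004']
```

In practice the LD pruning step requires genotype data from a reference panel (e.g., 1000 Genomes), which is what the `clump_data` machinery in TwoSampleMR handles.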
Multiple MR methods were used to infer causal relationships between thyroid diseases and DVT, including the inverse variance weighted (IVW), weighted median, and MR-Egger methods, after harmonizing the SNPs across the GWASs of exposures and outcomes. The main analysis was conducted using the IVW method. Heterogeneity and pleiotropy tests were also performed for each MR analysis, and the MR-PRESSO global test [ 25 ] was used to detect horizontal pleiotropy. The effect of each SNP was visualized in a scatter plot, and a forest plot was used to examine the overall effects. When a significant causal relationship was confirmed by two-sample MR analysis, bidirectional MR analysis was performed to assess reverse causation by swapping the exposure and outcome factors, with all parameters set as before. All statistical analyses were performed using the TwoSampleMR package (version 0.5.7) in R (version 4.2.1).
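As background on the main method: the fixed-effect IVW estimate combines per-SNP Wald ratios (SNP-outcome effect divided by SNP-exposure effect), each weighted by the inverse of its approximate variance. A minimal sketch with made-up summary statistics (not data from this study):

```python
import math

def ivw_estimate(beta_exp, beta_out, se_out):
    """Fixed-effect inverse-variance weighted MR estimate.

    Each SNP contributes a Wald ratio beta_out/beta_exp, weighted by
    the inverse of its first-order variance, (se_out/beta_exp)^2.
    """
    ratios = [bo / be for be, bo in zip(beta_exp, beta_out)]
    weights = [(be / so) ** 2 for be, so in zip(beta_exp, se_out)]
    est = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return est, se

# Hypothetical summary statistics for three SNPs:
beta_exp = [0.10, 0.20, 0.15]   # SNP-exposure effects
beta_out = [0.010, 0.020, 0.015]  # SNP-outcome effects
se_out = [0.005, 0.004, 0.006]  # standard errors of outcome effects
est, se = ivw_estimate(beta_exp, beta_out, se_out)
print(f"IVW estimate: {est:.3f}, SE: {se:.4f}")  # -> IVW estimate: 0.100, SE: 0.0168
```

An odds ratio is then obtained by exponentiating the estimate when the outcome effects are on the log-odds scale.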
After harmonizing the SNPs across the GWASs for exposures and outcomes, the IVW (OR = 1.0009, p = 0.024, Table 2) and weighted median (OR = 1.001, p = 0.028) analyses revealed significant causal effects of autoimmune hyperthyroidism on DVT risk. Cochran's Q test, the MR-Egger intercept, and the MR-PRESSO test suggested that the results were not influenced by heterogeneity or pleiotropy (Table 2). However, the leave-one-out analysis revealed a significant difference after removing some SNPs (rs179247, rs6679677, rs72891915, and rs942495, p < 0.05, Figure S2a), indicating that the MR results depended on these SNPs (Figure S2, Table S1). No significant effects were observed for the other thyroid diseases (Table 2). The estimated scatter plot of the association between thyroid diseases and DVT is presented in Fig. 2, indicating a positive causal relationship between autoimmune hyperthyroidism and DVT (Fig. 2a). The forest plots of single SNPs affecting the risk of DVT are displayed in Figure S1.
The estimated scatter plot of the association between thyroid diseases and DVT. MR analyses are derived using the IVW, MR-Egger, weighted median, and mode methods. By fitting different models, the scatter plot shows the relationships between the SNPs and the exposure factors, predicting the associations between the SNPs and the outcomes
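The Cochran's Q statistic used above to assess heterogeneity measures how much the per-SNP Wald ratios disagree with the pooled IVW estimate. A minimal illustration with hypothetical ratios and inverse-variance weights (not the study's data):

```python
def cochran_q(ratios, weights):
    """Cochran's Q: weighted squared deviations of per-SNP Wald ratios
    from the pooled IVW estimate. Under the null of no heterogeneity,
    Q follows a chi-squared distribution with (k - 1) degrees of
    freedom, where k is the number of SNPs."""
    pooled = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    return sum(w * (r - pooled) ** 2 for r, w in zip(ratios, weights))

# Hypothetical Wald ratios and inverse-variance weights for four SNPs:
ratios = [0.08, 0.10, 0.11, 0.09]
weights = [900.0, 1600.0, 1100.0, 1400.0]
q = cochran_q(ratios, weights)
print(f"Q = {q:.2f} on {len(ratios) - 1} df")  # small Q -> no evidence of heterogeneity
```

A Q far exceeding its degrees of freedom would instead suggest that some instruments estimate a different effect, prompting a random-effects model or outlier removal (as MR-PRESSO does).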
Bidirectional MR analysis was performed to further examine the relationship between autoimmune hyperthyroidism and DVT. No reverse causal relationship was observed (Table S2), supporting a unidirectional causal effect of autoimmune hyperthyroidism on DVT.
This study used MR to assess whether thyroid diseases affect the incidence of DVT. The results showed that autoimmune hyperthyroidism can increase the risk of DVT occurrence, but a reverse causal relationship was not observed between them using bidirectional MR analysis. However, other thyroid diseases, such as subacute thyroiditis, hypothyroidism, and thyroid cancer, did not show a similar effect.
Recently, several studies have suggested that thyroid-related diseases may be associated with the occurrence of DVT in the lower extremities, which provided etiological clues for our research. In 2006, a review noted the association between thyroid dysfunction and coagulation disorders [ 26 ], indicating a hypercoagulable state in patients with hyperthyroidism. In 2011, a further review suggested a clear association between hypothyroidism and bleeding tendency, while hyperthyroidism appeared to increase the risk of thrombotic events, particularly cerebral venous thrombosis [ 27 ]. A retrospective cohort study [ 28 ] supported this conclusion, although it only observed a higher proportion of concurrent thyroid dysfunction in patients with cerebral venous thrombosis. The relationship between thyroid function and venous thromboembolism remains controversial. Krieg et al. [ 29 ] found that hypothyroidism has a higher incidence in patients with chronic thromboembolic pulmonary hypertension and may be associated with more severe disease, which seems at odds with the earlier view that hyperthyroidism is associated with venous thrombosis. Alsaidan [ 30 ] also revealed that the risk of developing venous thrombosis nearly doubled in cases with a mild-to-moderate elevation of thyroid-stimulating hormone and free thyroxine (FT4), and increased about twofold in cases with a severe elevation. Raised thyroid hormones may increase the synthesis or secretion of coagulation factors or decrease fibrinolysis, which may lead to coagulation abnormalities.
Other thyroid diseases are also reported to be associated with DVT. In a large prospective cohort study [ 31 ], the incidence of venous thromboembolism was observed to increase in patients with thyroid cancer over the age of 60. However, other retrospective studies did not find any difference compared with the general population [ 32 ]. In the post-COVID-19 era, subacute thyroiditis has received considerable attention from researchers. New evidence suggests that COVID-19 may be associated with subacute thyroiditis [ 33 , 34 ]. Mondal et al. [ 35 ] found that out of 670 COVID-19 patients, 11 presented with post-COVID-19 subacute thyroiditis. Among them, painless subacute thyroiditis appeared earlier and exhibited symptoms of hyperthyroidism. Another case report also indicated the same result, that is, subacute thyroiditis occurred after COVID-19 infection, accompanied by thyroid function changes [ 36 ]. This led us to hypothesize that subacute thyroiditis may cause DVT through alterations in thyroid function.
This study identified a significant causal relationship between autoimmune hyperthyroidism and DVT (p = 0.02). The data were tested for heterogeneity and gene pleiotropy using the MR-Egger, Cochran's Q, and MR-PRESSO tests, and there was no evidence that the results were influenced by pleiotropy or heterogeneity. In the leave-one-out analysis, four of the five selected SNPs showed significant effects of autoimmune hyperthyroidism on DVT, suggesting an impact of these SNPs on the DVT outcome. Previous studies have focused on the relationship between hyperthyroidism and its secondary arrhythmias and arterial thromboembolism [37, 38]. This study emphasizes the risk of DVT in patients with hyperthyroidism, which has clinical implications: prophylactic anticoagulant therapy may help prevent DVT in these patients. The results did not, however, reveal any evidence of a relationship between the other thyroid diseases and DVT occurrence. This may be due to the limited database, as this study only included GWAS data from a subset of European populations; large-scale multiracial studies are needed in the future.
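As a rough illustration of the estimator behind such two-sample MR results, a fixed-effect inverse-variance-weighted (IVW) causal estimate and Cochran's Q heterogeneity statistic can be computed from per-SNP summary statistics. This is a minimal sketch with first-order weights, not the pipeline used in the study; all names here are ours.

```python
import numpy as np

def ivw_mr(beta_exp, beta_out, se_out):
    """Fixed-effect inverse-variance-weighted (IVW) two-sample MR estimate.

    beta_exp: per-SNP effects on the exposure (e.g. autoimmune hyperthyroidism)
    beta_out: per-SNP effects on the outcome (e.g. DVT)
    se_out:   standard errors of the outcome effects (first-order weights)
    Returns the pooled causal estimate, its standard error, and Cochran's Q.
    """
    beta_exp = np.asarray(beta_exp, dtype=float)
    beta_out = np.asarray(beta_out, dtype=float)
    se_out = np.asarray(se_out, dtype=float)
    ratio = beta_out / beta_exp          # per-SNP Wald ratios
    weights = (beta_exp / se_out) ** 2   # inverse of the first-order ratio variance
    estimate = np.sum(weights * ratio) / np.sum(weights)
    se = np.sqrt(1.0 / np.sum(weights))
    # Cochran's Q: heterogeneity of the per-SNP ratios around the pooled estimate
    q = np.sum(weights * (ratio - estimate) ** 2)
    return estimate, se, q
```

A large Q relative to a chi-squared distribution with (number of SNPs − 1) degrees of freedom would flag the heterogeneity that tests like MR-Egger and MR-PRESSO probe further.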
There are some limitations to this study. First, it was limited to participants of European descent; further investigation is required to confirm these findings in other ethnicities. Second, this study did not address the relationship between complications of hyperthyroidism and DVT. Additionally, the IVs were selected from the database using statistical methods rather than from a real population, which may weaken the effects of the screened IVs and reduce the clinical significance of the MR analysis. Moreover, the definitions of some diseases were unclear in the original database, and some diseases were self-reported, which may reduce diagnostic accuracy. Further research based on prospective cohort studies and randomized controlled trials (RCTs) is needed to clarify the causal relationship between DVT and thyroid diseases.
This study analyzed large-scale genetic data and provided evidence of a causal relationship between autoimmune hyperthyroidism and the risk of DVT, in contrast to the other thyroid diseases investigated. Prospective RCTs or MR studies with larger sample sizes are still needed to draw more precise conclusions.
The IEU OpenGWAS project: https://gwas.mrcieu.ac.uk/
Ortel TL, Neumann I, Ageno W, et al. American society of hematology 2020 guidelines for management of venous thromboembolism: treatment of deep vein thrombosis and pulmonary embolism. Blood Adv. 2020;4(19):4693–738.
Mehrabi F, Farshbafnadi M, Rezaei N. Post-discharge thromboembolic events in COVID-19 patients: a review on the necessity for prophylaxis. Clin Appl Thromb Hemost. 2023;29:10760296221148476.
Loffredo L, Vidili G, Sciacqua A, et al. Asymptomatic and symptomatic deep venous thrombosis in hospitalized acutely ill medical patients: risk factors and therapeutic implications. Thromb J. 2022;20(1):72.
RIETE Registry. Death within 30 days. 2022. Accessed 23 Aug 2023. https://rieteregistry.com/graphics-interactives/dead-30-days/
Minges KE, Bikdeli B, Wang Y, Attaran RR, Krumholz HM. National and regional trends in deep vein thrombosis hospitalization rates, discharge disposition, and outcomes for medicare beneficiaries. Am J Med. 2018;131(10):1200–8.
Di Nisio M, van Es N, Büller HR. Deep vein thrombosis and pulmonary embolism. Lancet. 2016;388(10063):3060–73.
Aquila I, Boca S, Caputo F, et al. An unusual case of sudden death: is there a relationship between thyroid disorders and fatal pulmonary thromboembolism? A case report and review of literature. Am J Forensic Med Pathol. 2017;38(3):229–32.
Katić J, Katić A, Katić K, Duplančić D, Lozo M. Concurrent deep vein thrombosis and pulmonary embolism associated with hyperthyroidism: a case report. Acta Clin Croat. 2021;60(2):314–6.
Hieber M, von Kageneck C, Weiller C, Lambeck J. Thyroid diseases are an underestimated risk factor for cerebral venous sinus thrombosis. Front Neurol. 2020;11:561656.
Pohl KR, Hobohm L, Krieg VJ, et al. Impact of thyroid dysfunction on short-term outcomes and long-term mortality in patients with pulmonary embolism. Thromb Res. 2022;211:70–8.
Sirota DK. Axillary vein thrombosis as the initial symptom in metastatic papillary carcinoma of the thyroid. Mt Sinai J Med. 1989;56(2):111–3.
Raveh E, Cohen M, Shpitzer T, Feinmesser R. Carcinoma of the thyroid: a cause of hypercoagulability? Ear Nose Throat J. 1995;74(2):110–2.
Davey Smith G, Hemani G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum Mol Genet. 2014;23(R1):R89–98.
Stuijver DJ, van Zaane B, Romualdi E, Brandjes DP, Gerdes VE, Squizzato A. The effect of hyperthyroidism on procoagulant, anticoagulant and fibrinolytic factors: a systematic review and meta-analysis. Thromb Haemost. 2012;108(6):1077–88.
Son HM. Massive cerebral venous sinus thrombosis secondary to Graves’ disease. Yeungnam Univ J Med. 2019;36(3):273–80.
Elbers LP, Moran C, Gerdes VE, et al. The hypercoagulable state in hyperthyroidism is mediated via the thyroid hormone β receptor pathway. Eur J Endocrinol. 2016;174(6):755–62.
Davis PJ, Sudha T, Lin HY, et al. Thyroid hormone, hormone analogs, and angiogenesis. Compr Physiol. 2015;6(1):353–62.
Mousa SA, Lin HY, Tang HY, et al. Modulation of angiogenesis by thyroid hormone and hormone analogues: implications for cancer management. Angiogenesis. 2014;17(3):463–9.
Lou Z, Li X, Li C, et al. Microarray profile of circular RNAs identifies hsa_circ_000455 as a new circular RNA biomarker for deep vein thrombosis. Vascular. 2022;30(3):577–89.
Hemani G, Bowden J, Davey SG. Evaluating the potential role of pleiotropy in Mendelian randomization studies. Hum Mol Genet. 2018;27(R2):R195–208.
Zhang Z, Li L, Hu Z, et al. Causal effects between atrial fibrillation and heart failure: evidence from a bidirectional Mendelian randomization study. BMC Med Genomics. 2023;16(1):187.
Emdin CA, Khera AV, Kathiresan S. Mendelian randomization. JAMA. 2017;318(19):1925–6.
Skrivankova VW, Richmond RC, Woolf BAR, et al. Strengthening the reporting of observational studies in epidemiology using Mendelian randomization: the STROBE-MR statement. JAMA. 2021;326(16):1614–21.
Hemani G, Zheng J, Elsworth B, et al. The MR-Base platform supports systematic causal inference across the human phenome. Elife. 2018;7: e34408.
Verbanck M, Chen CY, Neale B, Do R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat Genet. 2018;50(5):693–8.
Franchini M. Hemostatic changes in thyroid diseases: haemostasis and thrombosis. Hematology. 2006;11(3):203–8.
Franchini M, Lippi G, Targher G. Hyperthyroidism and venous thrombosis: a casual or causal association? A systematic literature review. Clin Appl Thromb Hemost. 2011;17(4):387–92.
Fandler-Höfler S, Pilz S, Ertler M, et al. Thyroid dysfunction in cerebral venous thrombosis: a retrospective cohort study. J Neurol. 2022;269(4):2016–21.
Krieg VJ, Hobohm L, Liebetrau C, et al. Risk factors for chronic thromboembolic pulmonary hypertension—importance of thyroid disease and function. Thromb Res. 2020;185:20–6.
Alsaidan AA, Alruwiali F. Association between hyperthyroidism and thromboembolism: a retrospective observational study. Ann Afr Med. 2023;22(2):183–8.
Walker AJ, Card TR, West J, Crooks C, Grainge MJ. Incidence of venous thromboembolism in patients with cancer—a cohort study using linked United Kingdom databases. Eur J Cancer. 2013;49(6):1404–13.
Ordookhani A, Motazedi A, Burman KD. Thrombosis in thyroid cancer. Int J Endocrinol Metab. 2017;16(1): e57897.
Ziaka M, Exadaktylos A. Insights into SARS-CoV-2-associated subacute thyroiditis: from infection to vaccine. Virol J. 2023;20(1):132.
Henke K, Odermatt J, Ziaka M, Rudovich N. Subacute thyroiditis complicating COVID-19 infection. Clin Med Insights Case Rep. 2023;16:11795476231181560.
Mondal S, DasGupta R, Lodh M, Ganguly A. Subacute thyroiditis following recovery from COVID-19 infection: novel clinical findings from an Eastern Indian cohort. Postgrad Med J. 2023;99(1172):558–65.
Nham E, Song E, Hyun H, et al. Concurrent subacute thyroiditis and graves’ disease after COVID-19: a case report. J Korean Med Sci. 2023;38(18): e134.
Mouna E, Molka BB, Sawssan BT, et al. Cardiothyreosis: epidemiological, clinical and therapeutic approach. Clin Med Insights Cardiol. 2023;17:11795468231152042.
Maung AC, Cheong MA, Chua YY, Gardner DS. When a storm showers the blood clots: a case of thyroid storm with systemic thromboembolism. Endocrinol Diabetes Metab Case Rep. 2021;2021:20–0118.
Lifeng Zhang and Kaibei Li have contributed equally to this work and share the first authorship.
Department of Vascular Surgery, Hospital of Chengdu University of Traditional Chinese Medicine, No. 39, Shierqiao Road, Jinniu District, Chengdu, 610072, Sichuan, People’s Republic of China
Lifeng Zhang, Qifan Yang, Yao Lin, Caijuan Geng, Wei Huang & Wei Zeng
Disinfection Supply Center, Hospital of Chengdu University of Traditional Chinese Medicine, No. 39, Shierqiao Road, Jin Niu District, Chengdu, 610072, Sichuan, People’s Republic of China
Conception and design: LFZ and WZ. Analysis and interpretation: LFZ, KBL and WZ. Data collection: LFZ, QFY, YL, CJG and WH. Writing the article: LFZ, KBL. Critical revision of the article: LFZ, QFY and WZ. Final approval of the article: LFZ, KBL, YL, CJG, WH, QFY and WZ. Statistical analysis: YL, QFY.
Correspondence to Wei Zeng.
Ethics approval and consent to participate.
Ethical approval was obtained in all original studies. This study complies with the terms of use of the database.
Publisher's note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Zhang, L., Li, K., Yang, Q. et al. Associations between deep venous thrombosis and thyroid diseases: a two-sample bidirectional Mendelian randomization study. Eur J Med Res 29, 327 (2024). https://doi.org/10.1186/s40001-024-01933-1
Received: 12 September 2023
Accepted: 09 June 2024
Published: 14 June 2024
International Journal of Impotence Research (2024)
The proliferation of microplastics (MPs) represents a burgeoning environmental and health crisis. Measuring less than 5 mm in diameter, MPs have infiltrated atmospheric, freshwater, and terrestrial ecosystems, penetrating commonplace consumables like seafood, sea salt, and bottled beverages. Their size and surface area render them susceptible to chemical interactions with physiological fluids and tissues, raising bioaccumulation and toxicity concerns. Human exposure to MPs occurs through ingestion, inhalation, and dermal contact. To date, there is no direct evidence identifying MPs in penile tissue. The objective of this study was to assess the potential aggregation of MPs in penile tissue. Tissue samples were extracted from six individuals who underwent surgery for a multi-component inflatable penile prosthesis (IPP). Samples were obtained from the corpora using Adson forceps before corporotomy dilation and device implantation and placed into cleaned glassware. A control sample was collected and stored in a McKesson plastic specimen container. The tissue fractions were analyzed using the Agilent 8700 Laser Direct Infrared (LDIR) Chemical Imaging System (Agilent Technologies). In addition, the morphology of the particles was investigated with a Zeiss Merlin scanning electron microscope (SEM), complementing LDIR by resolving particles below 20 µm. MPs were identified via LDIR in 80% of the samples, ranging in size from 20 to 500 µm; smaller particles, down to 2 µm, were detected via SEM. Seven types of MPs were found in the penile tissue, with polyethylene terephthalate (47.8%) and polypropylene (34.7%) being the most prevalent. The detection of MPs in penile tissue raises inquiries about the ramifications of environmental pollutants on sexual health. Our research adds a key dimension to the discussion on man-made pollutants, focusing on MPs in the male reproductive system.
Data availability.
All relevant data to the current study that was generated and analyzed is available upon reasonable request from the corresponding author.
Authors and affiliations.
Desai Sethi Urology Institute, Miller School of Medicine, University of Miami, Miami, FL, USA
Jason Codrington, Alexandra Aponte Varnum, Joginder Bidhan, Kajal Khodamoradi, Aymara Evans, David Velasquez, Christina C. Yarborough, Ashutosh Agarwal, Edoardo Pozzi, Francesco Mesquita, Francis Petrella, David Miller & Ranjith Ramasamy
Institute of Coastal Environmental Chemistry, Department for Inorganic Environmental Chemistry, Helmholtz-Zentrum Hereon, Max-Planck-Str 1, 21502, Geesthacht, Germany
Lars Hildebrandt & Daniel Pröfrock
Institute of Membrane Research, Helmholtz-Zentrum Hereon, Max-Planck-Str 1, 21502, Geesthacht, Germany
Anke-Lisa Höhme & Martin Held
Dr. J.T. MacDonald Foundation BioNIUM, Miller School of Medicine, University of Miami, Miami, FL, USA
Bahareh Ghane-Motlagh
Department of Biomedical Engineering, University of Miami, Miami, FL, USA
Ashutosh Agarwal
University of Colorado, Anschutz Medical Campus, Aurora, CO, USA
Justin Achua
Vita-Salute San Raffaele University, Milan, Italy
Edoardo Pozzi
IRCCS Ospedale San Raffaele, Urology, Milan, Italy
Jason Codrington—conceptualization, methodology, investigation, project administration, data curation, visualization, writing—original draft, editing. Alexandra Aponte Varnum—investigation, writing—original draft, editing, data curation, visualization. Lars Hildebrandt—investigation, writing—original draft, validation, resources. Daniel Pröfrock—investigation, editing, validation, resources. Joginder Bidhan—resources, writing—original draft. Kajal Khodamoradi—project administration, resources. Anke-Lisa Höhme—investigation, visualization. Martin Held—writing—original draft, editing. Aymara Evans—writing—original draft. David Velasquez—writing—original draft. Christina C. Yarborough—writing—original draft. Bahareh Ghane-Motlagh—investigation. Ashutosh Agarwal—investigation. Justin Achua—writing—original draft. Edoardo Pozzi—editing. Francesco Mesquita—editing. Francis Petrella—writing—review. David Miller—writing—review. Ranjith Ramasamy—conceptualization, methodology, project administration, resources, supervision, editing, funding acquisition
Correspondence to Ranjith Ramasamy.
Competing interests.
Dr. Edoardo Pozzi is currently an Associate Editor for the International Journal of Impotence Research.
The study was approved by the Institutional Review Board of the University of Miami (Study # 20150740) and conducted following the Declaration of Helsinki. All patients provided written and informed consent to participate in the study.
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
Codrington, J., Varnum, A.A., Hildebrandt, L. et al. Detection of microplastics in the human penis. Int J Impot Res (2024). https://doi.org/10.1038/s41443-024-00930-6
Received: 21 March 2024
Revised: 29 May 2024
Accepted: 04 June 2024
Published: 19 June 2024
Tumor microenvironment (TME) heterogeneity is an important factor affecting the response to immune checkpoint inhibitors (ICIs). However, the TME heterogeneity of melanoma has not yet been fully characterized.
We downloaded single-cell sequencing datasets from two melanoma patients from the GEO database and used the "Scissor" and "BayesPrism" algorithms to comprehensively characterize microenvironment cells based on single-cell and bulk RNA-seq data. A prediction model of immunotherapy response was constructed by machine learning and validated in three cohorts from the GEO database.
We identified seven cell types. In the Scissor+ cell population, the three most abundant types were T cells, B cells, and melanoma cells, whereas the Scissor− population contained more macrophages. Quantifying TME features revealed significant differences in B cells between responders and non-responders: the higher the proportion of B cells, the better the prognosis. Macrophages were significantly increased in the non-responder group. Finally, a nine-gene signature for predicting ICI response was constructed, and its predictive performance was superior in three external validation cohorts.
Our study revealed the heterogeneity of melanoma TME and found a new predictive biomarker, which provided theoretical support and new insights for precise immunotherapy of melanoma patients.
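The predictive performance of such a gene signature is typically summarized by the area under the ROC curve (AUC). As a minimal, self-contained sketch (not the study's code), the AUC can be computed from predicted response scores via the rank-sum identity: it equals the probability that a randomly chosen responder scores higher than a randomly chosen non-responder.

```python
import numpy as np

def roc_auc(y_true, scores):
    """AUC via the Mann-Whitney U / rank-sum identity.

    y_true: 0/1 labels (1 = responder), scores: predicted response scores.
    """
    y = np.asarray(y_true)
    s = np.asarray(scores, dtype=float)
    order = s.argsort()
    ranks = np.empty(len(s), dtype=float)
    ranks[order] = np.arange(1, len(s) + 1)
    # assign average ranks to tied scores
    for v in np.unique(s):
        tied = s == v
        ranks[tied] = ranks[tied].mean()
    n_pos = int((y == 1).sum())
    n_neg = len(y) - n_pos
    u = ranks[y == 1].sum() - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)
```

An AUC of 0.5 corresponds to random prediction and 1.0 to perfect separation of responders from non-responders.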
Data availability.
No datasets were generated or analysed during the current study.
Abbreviations
ICI: Immune checkpoint inhibitors
PD-1: Programmed cell death-1
PD-L1: Programmed cell death-ligand 1
CTLA-4: Cytotoxic T lymphocyte-associated protein 4
OS: Overall survival
scRNA-seq: Single-cell RNA sequencing
GEO: Gene Expression Omnibus
CR: Complete remission
PR: Partial remission
SD: Stable disease
PFS: Progression-free survival
PD: Progressive disease
UMAP: Uniform Manifold Approximation and Projection
DEGs: Differentially expressed genes
GO: Gene Ontology
KEGG: Kyoto Encyclopedia of Genes and Genomes
MM: Module membership
GS: Gene significance
SVM: Support vector machine
AUC: Area under the curve
IRGs: Immunotherapy response-related genes
TLS: Tertiary lymphoid structure
Pre-TCR: Pre-T-cell receptor
Zhang Y, Bai Y, Ma XX, Song JK, Luo Y, Fei XY, et al. Clinical-mediated discovery of pyroptosis in CD8 + T cell and NK cell reveals melanoma heterogeneity by single-cell and bulk sequence. Cell Death Dis. 2023;14(8):553.
Cao J, Spielmann M, Qiu X, Huang X, Ibrahim DM, Hill AJ, et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature. 2019;566(7745):496–502.
Jin S, Guerrero-Juarez CF, Zhang L, Chang I, Ramos R, Kuan CH, et al. Inference and analysis of cell-cell communication using CellChat. Nat Commun. 2021;12(1):1088.
Sun D, Guan X, Moran AE, Wu LY, Qian DZ, Schedin P, et al. Identifying phenotype-associated subpopulations by integrating bulk and single-cell sequencing data. Nat Biotechnol. 2022;40(4):527–38.
Chu T, Wang Z, Pe’er D, Danko CG. Cell type and gene expression deconvolution with BayesPrism enables bayesian integrative analysis across bulk and single-cell RNA sequencing in oncology. Nat Cancer. 2022;3(4):505–17.
Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008;9:559.
Wang H, Shao Y, Zhou S, Zhang C, Xiu N. Support Vector Machine Classifier via L0/1 soft-margin loss. IEEE Trans Pattern Anal Mach Intell. 2022;44(10):7253–65.
Reinhold WC, Sunshine M, Liu H, Varma S, Kohn KW, Morris J, et al. CellMiner: a web-based suite of genomic and pharmacologic tools to explore transcript and drug patterns in the NCI-60 Cell Line Set. Cancer Res. 2012;72(14):3499–511.
Tirosh I, Izar B, Prakadan SM, Wadsworth MH 2nd, Treacy D, Trombetta JJ, et al. Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science. 2016;352(6282):189–96.
Chen C, Li Y, Zhou ZY, Sun GQ. An Immune-Related Gene Prognostic Index for Head and Neck squamous cell carcinoma. Clin Cancer Res. 2021;27(1):330–41.
Davidson G, Helleux A, Vano YA, Lindner V, Fattori A, Cerciat M, et al. Mesenchymal-like Tumor cells and Myofibroblastic Cancer-Associated fibroblasts are Associated with Progression and Immunotherapy Response of Clear Cell Renal Cell Carcinoma. Cancer Res. 2023;83(17):2952–69.
Bindea G, Mlecnik B, Tosolini M, Kirilovsky A, Waldner M, Obenauf AC, et al. Spatiotemporal Dynamics of Intratumoral Immune Cells Reveal the Immune Landscape in Human Cancer. Immunity. 2013;39(4):782–95.
Song P, Li W, Guo L, Ying J, Gao S, He J. Identification and validation of a Novel signature based on NK cell marker genes to Predict Prognosis and Immunotherapy Response in Lung Adenocarcinoma by Integrated Analysis of single-cell and bulk RNA-Sequencing. Front Immunol. 2022;13:850745.
Song P, Li W, Wu X, Qian Z, Ying J, Gao S, et al. Integrated analysis of single-cell and bulk RNA-sequencing identifies a signature based on B cell marker genes to predict prognosis and immunotherapy response in lung adenocarcinoma. Cancer Immunol Immunother. 2022;71(10):2341–54.
Shi X, Dong A, Jia X, Zheng G, Wang N, Wang Y, et al. Integrated analysis of single-cell and bulk RNA-sequencing identifies a signature based on T-cell marker genes to predict prognosis and therapeutic response in lung squamous cell carcinoma. Front Immunol. 2022;13:992990.
Becht E, Giraldo NA, Lacroix L, Buttard B, Elarouci N, Petitprez F, et al. Estimating the population abundance of tissue-infiltrating immune and stromal cell populations using gene expression. Genome Biol. 2016;17(1):218.
Schumacher TN, Thommen DS. Tertiary lymphoid structures in cancer. Science. 2022;375(6576):eabf9419.
Fridman WH, Sibéril S, Pupier G, Soussan S, Sautès-Fridman C. Activation of B cells in Tertiary lymphoid structures in cancer: anti-tumor or anti-self? Semin Immunol. 2023;65:101703.
Lindner S, Dahlke K, Sontheime K, Hagn M, Kaltenmeier C, Barth TF, et al. Interleukin-21-Induced Granzyme B-Expressing B lymphocytes infiltrate tumors and regulate T cells. Cancer Res. 2013;73(8):2468–79.
Zhang G, Gao Z, Guo X, Ma R, Wang X, Zhou P, et al. CAP2 promotes gastric cancer metastasis by mediating the interaction between tumor cells and tumor-associated macrophages. J Clin Invest. 2023;133(21):e166224.
Ostuni R, Kratochvill F, Murray PJ, Natoli G. Macrophages and cancer: from mechanisms to therapeutic implications. Trends Immunol. 2015;36(4):229–39.
Wildes TJ, Dyson KA, Francis C, Wummer B, Yang C, Yegorov O, et al. Immune escape after adoptive T-cell therapy for malignant gliomas. Clin Cancer Res. 2020;26(21):5689–700.
Chen S, Saeed AFUH, Liu Q, Jiang Q, Xu H, Xiao GG, et al. Macrophages in immunoregulation and therapeutics. Signal Transduct Target Ther. 2023;8(1):207.
Yu Y, Dai K, Gao Z, Tang W, Shen T, Yuan Y, et al. Sulfated polysaccharide directs therapeutic angiogenesis via endogenous VEGF secretion of macrophages. Sci Adv. 2021;7(7):eabd8217.
Aegerter H, Lambrecht BN, Jakubzick CV. Biology of Lung macrophages in health and disease. Immunity. 2022;55(9):1564–80.
Fridman WH, Meylan M, Petitprez F, Sun CM, Italiano A, Sautès-Fridman C. B cells and tertiary lymphoid structures as determinants of tumour immune contexture and clinical outcome. Nat Rev Clin Oncol. 2022;19(7):441–57.
Schmidt M, Micke P, Gehrmann M, Hengstler JG. Immunoglobulin kappa chain as an immunologic biomarker of prognosis and chemotherapy response in solid tumors. Oncoimmunology. 2012;1(7):1156–8.
Fridman WH, Meylan M, Petitprez F, Sun C-M, Italiano A, Sautès-Fridman C. B cells and tertiary lymphoid structures as determinants of tumour immune contexture and clinical outcome. Nat Rev Clin Oncol. 2022;19:441–57.
Cabrita R, Lauss M, Sanna A, Donia M, Skaarup Larsen M, Mitra S, et al. Tertiary lymphoid structures improve immunotherapy and survival in melanoma. Nature. 2020;577(7791):561–5. https://doi.org/10.1038/s41586-019-1914-8 .
Kim S, Song HS, Yu J, Kim YM. MiT Family transcriptional factors in Immune Cell functions. Mol Cells. 2021;44(5):342–55.
Rehli M, Lichanska A, Cassady AI, Ostrowski MC, Hume DA. TFEC is a macrophage-restricted member of the microphthalmia-TFE subfamily of basic helix-loop-helix leucine zipper transcription factors. J Immunol. 1999;162(3):1559–65.
Wang N, Zhou X, Wang X, Zhu X. Identification of Grb2-associated binding protein 3 expression to predict clinical outcomes and immunotherapeutic responses in lung adenocarcinoma. J Biochem Mol Toxicol. 2022;36(10):e23166.
Sliz A, Locker KCS, Lampe K, Godarova A, Plas DR, Janssen EM, et al. Gab3 is required for IL-2– and IL-15–induced NK cell expansion and limits trophoblast invasion during pregnancy. Sci Immunol. 2019;4(38):eaav3866.
Awasthi N, Liongue C, Ward AC. STAT proteins: a kaleidoscope of canonical and non-canonical functions in immunity and cancer. J Hematol Oncol. 2021;14(1):198.
Mogensen TH, IRF, Transcription Factors STAT. - from Basic Biology to roles in infection, protective immunity, and primary immunodeficiencies. Front Immunol. 2019;9:3047.
Recio C, Guerra B, Guerra-Rodríguez M, Aranda-Tavío H, Martín-Rodríguez P, de Mirecki-Garrido M, et al. Signal transducer and activator of transcription (STAT)-5: an opportunity for drug development in oncohematology. Oncogene. 2019;38(24):4657–68.
Salas A, Hernandez-Rocha C, Duijvestein M, Faubion W, McGovern D, Vermeire S, et al. JAK-STAT pathway targeting for the treatment of inflammatory bowel disease. Nat Rev Gastroenterol Hepatol. 2020;17(6):323–37.
Wang H, Zeng X, Fan Z, Lim B. RhoH modulates pre-TCR and TCR signalling by regulating LCK. Cell Signal. 2011;23(1):249–58.
Jiang B, Weinstock DM, Donovan KA, Sun HW, Wolfe A, Amaka S, et al. ITK degradation to block T cell receptor signaling and overcome therapeutic resistance in T cell lymphomas. Cell Chem Biol. 2023;30(4):383–e3936.
Gu Y, Jasti AC, Jansen M, Siefring JE. RhoH, a hematopoietic-specific rho GTPase, regulates proliferation, survival, migration, and engraftment of hematopoietic progenitor cells. Blood. 2005;105:1467–75.
Guo F, Cheng X, Jing B, Wu H, Jin X. FGD3 binds with HSF4 to suppress p65 expression and inhibit pancreatic cancer progression. Oncogene. 2022;41(6):838–51.
Zhu J, Hao S, Zhang X, Qiu J, Xuan Q, Ye L. Integrated Bioinformatics Analysis Exhibits Pivotal Exercise-Induced genes and corresponding pathways in malignant melanoma. Front Genet. 2021;11:637320.
Huang L, Zhong L, Cheng R, Chang L, Qin M, Liang H, et al. Ferroptosis and WDFY4 as novel targets for immunotherapy of lung adenocarcinoma. Aging. 2023;15(18):9676–94.
Conti BJ, Davis BK, Zhang J, O’connor W KL Jr, Ting JP. CATERPILLER 16.2 (CLR16.2), a novel NBD/LRR family member that negatively regulates T cell function. J Biol Chem. 2005;280(18):18375–85.
Caraux A, Kim N, Bell SE, Zompi S, Ranson T, Lesjean-Pottier S, et al. Phospholipase C-gamma2 is essential for NK cell cytotoxicity and innate immunity to malignant and virally infected cells. Blood. 2006;107(3):994–1002.
Qiu C, Shi W, Wu H, Zou S, Li J, Wang D, et al. Identification of Molecular subtypes and a prognostic signature based on inflammation-related genes in Colon Adenocarcinoma. Front Immunol. 2021;12:769685.
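The full-extraction exercise described above can be sketched numerically. The following is a minimal illustration (not the seminar's SPSS analysis) using synthetic data and hypothetical item names: extracting as many components as there are items, and verifying that the resulting loadings exactly reproduce the correlation matrix, with the eigenvalues summing to the total variance (the number of items).

```python
import numpy as np

# Synthetic stand-in for the item data: 200 respondents, 8 items,
# with some induced inter-item correlation (all values are illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
X[:, 1] += 0.6 * X[:, 0]
X[:, 2] += 0.4 * X[:, 0]

# 8 x 8 item correlation matrix.
R = np.corrcoef(X, rowvar=False)

# Eigendecomposition of R; each eigenvalue is the variance explained
# by one component. eigh returns ascending order, so sort descending.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Component loadings: eigenvectors scaled by sqrt(eigenvalue).
loadings = eigvecs * np.sqrt(eigvals)

# With all 8 components retained, the loadings reproduce R exactly,
# and the eigenvalues sum to the number of items (the total variance).
assert np.allclose(loadings @ loadings.T, R)
assert np.isclose(eigvals.sum(), 8.0)
```

Dropping all but the first few columns of `loadings` is what a real PCA does; the reproduction of `R` then becomes approximate, which is why deciding how many components to retain is the next step.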