Sequence-based identification of amyloidogenic β-hairpins reveals a prostatic acid phosphatase fragment promoting semen amyloid formation

β-Structure-rich amyloid fibrils are hallmarks of several diseases, including Alzheimer’s (AD), Parkinson’s (PD), and type 2 diabetes (T2D). While amyloid fibrils typically consist of parallel β-sheets, the anti-parallel β-hairpin is a structural motif accessible to amyloidogenic proteins in their monomeric and oligomeric states. Here, to investigate implications of β-hairpins in amyloid formation, potential β-hairpin-forming amyloidogenic segments in the human proteome were predicted based on sequence similarity with β-hairpins previously observed in Aβ, α-synuclein, and islet amyloid polypeptide, amyloidogenic proteins associated with AD, PD, and T2D, respectively. These three β-hairpins, established upon binding to the engineered binding protein β-wrapin AS10, are characterized by proximity of two sequence segments rich in hydrophobic and aromatic amino acids, with high β-aggregation scores according to the TANGO algorithm. Using these criteria, 2505 potential β-hairpin-forming amyloidogenic segments in 2098 human proteins were identified. Characterization of a test set of eight protein segments showed that seven assembled into Thioflavin T-positive aggregates and four formed β-hairpins in complex with AS10 according to NMR. One of those is a segment of prostatic acid phosphatase (PAP) comprising amino acids 185–208. PAP is naturally cleaved into fragments, including PAP(248−286) which forms functional amyloid in semen. We find that PAP(185−208) strongly decreases the protein concentrations required for fibril formation of PAP(248−286) and of another semen amyloid peptide, SEM1(86−107), indicating that it promotes nucleation of semen amyloids. In conclusion, β-hairpin-forming amyloidogenic protein segments could be identified in the human proteome with potential roles in functional or disease-related amyloid formation.


Introduction
Amyloid fibrils are a structural key feature of several prevalent diseases including Alzheimer's disease (AD), Parkinson's disease (PD), and type 2 diabetes (T2D), but also fulfill functional roles in biology [1][2][3][4].Their principal framework is the cross-β structure, in which β-strands oriented perpendicular to the fibril axis form long β-sheets, stabilized by backbone hydrogen bonds (Fig. 1A).Amyloid fibrils consist of two or more such β-sheets that pack against each other, supported by interactions between their side chains.In the fibrils of disease-related amyloidogenic proteins a parallel in-register β-sheet organization is commonly established, i.e., identical side chains of neighboring monomer units stack on top of each other [1,4].Fibrils grow by stepwise addition of monomers, which form multiple β-strand-kink-β-strand motifs.The different β-strand forming segments of one monomer can make contact with each other via side chain-side chain interactions (Fig. 1A).This β-strand-kink-β-strand conformation is not stable in free monomers, but is stabilized in the amyloid fibril by the hydrogen bond network and side chain interactions within the highly ordered intermolecular β-sheets.
Here, we aimed to identify further human proteins containing proving Thioflavin T (ThT)-positive aggregation for seven and β-hairpin formation for four of the candidates.Further analysis of one segment, which stems from prostatic acid phosphate (PAP), demonstrated a promoting effect on physiologically relevant PAP semen amyloid formation.

Bioinformatics
The 64-bit Linux binary of the TANGO algorithm 2.3.1 [44][45][46] was downloaded from http://tango.crg.es.The annotated human proteome was acquired on the February 21st 2017 from Uniprot [47] using the query: reviewed:yes AND organism:"Homo sapiens (Human) [9606]" AND proteome:up000005640 and downloading the amino acid sequences as a FASTA file (20,162 protein sequences).The sequences were then prepared for the TANGO algorithm by creating files with up to one thousand lines (TANGO limit) all having the following structure: "uniprot ID" N N 7 298 0.1 "aa_sequence", corresponding to running the TANGO algorithm on all sequences assuming no modifications of termini, T = 298 K, and an ionic strength of 0.1 M (default parameters of TANGO).These batch files were then supplied to a locally running instance of TANGO.For unknown reasons some sequences resulted in errors, both when run locally as well as when the amino acid sequences were supplied to the online version of TANGO, and they were therefore excluded from any analysis.The Uniprot IDs of the error causing sequences were: Q8NHP1, P22352, P59796, P18283, P07203, Q9C0D9, P36969, Q8IZQ5, Q9BVL4, Q9BQE4, Q8WWX9, P49908, P62341, P59797, Q9NZV5, P63302, Q9Y6D0, O60613, Q99611, Q9NNW7, Q86VQ6, Q8WZ42, P49895, Q92813, P55073, Q9NZV6, Q16881.A filter was applied based on the length of aggregation prone β-stretches, the length of the turn between the stretches, presence of aromatic residues, presence of hydrophobic residues, and the presence of glycine.The parameters were based on three previously identified aggregation-prone β-hairpins in Aβ, αSyn, and IAPP [37][38][39].In particular, the following filter was developed and applied to the output from TANGO: 1.Only amino acids with a TANGO aggregation propensity > 1.06 were considered to be in β-stretch.2. Only β-stretches between 5 and 23 amino acids long were considered for further analysis.3.Only β-stretches connected to other β-stretches by 9 or fewer amino acids (i.e potential turns) were considered for further analysis.4.Only β-stretches where the first (i.e.N-terminal strand) had the following properties were considered: 41.At least one aromatic residue.42.A fractional content of the residues isoleucine, leucine, and valine between 0.32 and 0.5.

5.
Only β-stretches where the second (i.e.C-terminal strand) had the following properties were considered: 51. Between 1 and 3 glycine residues.52.A fractional content of the residues isoleucine, leucine, and valine between 0.32 and 0.61.
This resulted in 2505 potentially aggregation prone β-hairpins from 2098 protein sequences.The entire workflow is available as a Jupyter Notebook except the TANGO binary which should be obtained from the above mentioned source.Code and output sequences are available on zenodo under doi:10.5281/zenodo.8264535.

Peptides
Synthetic peptides were obtained either from peptides & elephants or from CASLO.To fully monomerize the peptides, 1 mg aliquots were solubilized in 1 mL hexafluoroisopropanol (HFIP), aliquoted in smaller amounts, lyophilized and stored at − 20 • C.

Cloning of the fusion constructs
Like other β-wrapins derived from the affibody ZAβ3, AS10 is a dimer of two identical subunits [43].In the fusion constructs the two AS10 subunits and the β-hairpin-forming peptides were linked via flexible glycine-serine linkers in the following format: N-His 6 -AS10(sub-unit1)-(G 4 S) 3 -AS10(subunit2)-(G 4 S) 3 -HairpinPeptide-C.DNA fragments containing the sequence of the fusion constructs were obtained from Thermo Fisher Scientific and cloned into vector pET302/NT-His using In-Fusion HD EcoDry Cloning Kit (Takara Bio).The success of the cloning was verified by sequencing.

Expression and purification of the fusion constructs
Expression of the fusion constructs was done in E. coli JM109 DE3 cells.An overnight pre-culture was grown in 2YT medium supplemented with 100 µg/mL ampicillin at 37 • C and 180 rpm.For a subsequent overnight pre-culture, M9 minimal medium supplemented with 100 µg/ mL ampicillin was inoculated with the 2YT pre-culture at an optical density of 0.05 and grown at the same conditions as the previous culture.An expression culture in M9 minimal medium supplemented with 100 µg/mL ampicillin, 15 N-NH 4 Cl (1 g/L) and 13 C 6 -glucose (2 g/L) was inoculated with the M9 pre-culture at an optical density of 0.05 and grown at 37 • C and 140 rpm until induction with Isopropyl-β-D-thiogalactopyranosid (IPTG), then incubated at 20 • C, 120 rpm for additional 20 h before harvesting.Cells were harvested by centrifugation at 5000 rpm for 10 min at 4 • C and frozen in loading buffer (50 mM Tris (hydroxymethyl)aminomethane (Tris), pH 8, 500 mM NaCl, 20 mM imidazole) with protease inhibitor (cOmplete EDTA-free Protease Inhibitor Cocktail, Roche).After thawing, cells were sonicated on ice twice for 5 min with the sonication probe MS-72, with an amplitude of 35% and a cycle of 3 s pulse and 5 s pause.Next, the cell debris was removed by centrifugation and the supernatant was applied to a NiNTA HisTrap FF column (GE Healthcare) in loading buffer.Protein was eluted with a mixture of loading buffer and elution buffer (50 mM Tris, pH 8, 500 mM NaCl, 500 mM imidazole) into a final concentration of 250 mM imidazole.The eluate was concentrated and applied to a size exclusion chromatography column (SEC Superdex 200 Increase 10/300 GL, GE Healthcare) in a sodium phosphate buffer (20 mM NaPi, pH 7.4, 50 mM NaCl).The purified fusion protein was frozen in liquid nitrogen and stored at − 80 • C.

Thioflavin T (ThT) aggregation assay
All ThT aggregation assays were done in Greiner 96-well half-area, clear bottom, low-binding plates.The measurements themselves were done in a BMG CLARIOstar plate reader at 37 • C and 300 rpm shaking in orbital mode with one glass bead (0.75-1 mm, Roth) per well.Wells with samples were always surrounded by wells filled with liquid, either other samples or 150 μl water, to counter potential artefacts due to evaporation.For characterization of the aggregation of the peptide test set, peptides were dissolved in 20 mM NaPi, pH 7.4, 50 mM NaCl, and added to the wells at final concentrations of 20 µM (CYP2W1, FibA, 3-PGDH, MIC26), 25 µM (TTC5, NDUFS7), 500 µM (NBPF1) or 550 µM (PAP) in triplicate samples of 80 μl.Final concentrations of 25 µM ThT and 0.04% sodium azide (NaN 3 ) were added.The plate was sealed with sealing tape and put into the plate reader, heated to 37 • C, and ThT fluorescence was read in 1000 cycles of 300 s each using bottom optics with excitation at 448 nm (10 nm bandwidth) and emission at 482 nm (10 nm bandwidth).To investigate the effect of PAP(185− 208) on the aggregation of PAP(248− 286) and SEM1(86− 107), the prepared peptides were dissolved in 50 mM Tris, pH 8.5.PAP(248− 286) and SEM1 (86− 107) were kept at constant concentrations of 500 and 300 µM, respectively.The final concentration of PAP(185− 208) was varied between 25 and 100 µM.Samples contained 40 µM ThT and 0.04% NaN 3 .

Atomic force microscopy (AFM)
For AFM imaging, 2 μl samples were taken out of the wells at the end of the ThT aggregation assays, put onto a freshly cleaved muscovite mica surface and dried during incubation for 10 min under the clean bench.
Subsequently, the samples were washed with 100 μl ddH 2 O in three steps and dried with a stream of N 2 gas.Imaging was performed in intermittent contact mode (AC mode) in a JPK NanoWizard 3 atomic force microscope using a silicon cantilever with silicon tip (OMCL-AC160TS-R3, Olympus) with a typical tip radius of 9 ± 2 nm, a force constant of 26 N/m and resonance frequency around 300 kHz.The images were processed using JPK DP Data Processing Software (version spm-5.0.84).

Isothermal titration calorimetry (ITC)
ITC was performed at 30 • C using a Microcal iTC200 calorimeter (GE Healthcare).AS10 was filled into the cell at a concentration of 30 µM in 20 mM sodium phosphate, pH 7.4, 50 mM NaCl.Peptides were titrated from the syringe in approximately 10-fold higher concentrations.Heats of post-saturation injections were averaged and subtracted from each injection to correct for heats of dilution and mixing.Dissociation constants were obtained from a nonlinear least-squares fit to a 1:1 binding model using MicroCal Origin.

Structure modeling
The structures of AS10 in complex with NDUFS7(104− 129), PAP (185− 208), or CYP2W1(176− 204) were initially modeled using the structure of the AS10:Aβ complex [51] as a template.The residues of Aβ forming β-sheets with AS10 were first mutated to the residues of the investigated β-hairpin peptides forming β-sheets with AS10 according to the experimentally derived β-sheet registries.Subsequently, the loop residues of Aβ (VGSNKG) were deleted, and the loop residues of the peptides were modeled using Modeller [52].The resulting structures were used as initial structures for 5-replicate MD simulations, which were performed using the same methods as in our previous studies on AS10 binding to Aβ, αSyn, and IAPP [51,53].Structural analysis comparing the simulated complex structures with experimental β-sheet registries and hydrogen atom distance constraints suggested that the MD simulations efficiently modeled the complex structures.

Prediction of amyloidogenic β-hairpins
To identify potential β-hairpin-forming amyloidogenic segments in the human proteome, we aimed to search for protein segments with sequence properties similar to those of the β-hairpins of Aβ, αSyn and IAPP established in their complexes with the engineered binding proteins (Fig. 2).We therefore looked for commonalities among these three β-hairpin sequences that might serve as criteria in the search for further amyloidogenic β-hairpins.When comparing the sequences of the Aβ, αSyn and IAPP β-hairpins directly, no obvious sequence similarity exists (Fig. 2B).Moreover, secondary structure prediction did not reveal a common theme among the three sequences that could be used as a search pattern (Fig. S1).As we are specifically interested in β-hairpins capable of amyloid formation, amyloid propensity predictors are of interest.The TANGO algorithm predicts secondary structure and also calculates amyloid propensity, or β-aggregation propensity as TANGO denotes it, enabling identification of aggregation-prone regions (APRs) [44][45][46].When applying TANGO to Aβ, αSyn and IAPP, a consistent pattern was observed where most of residues in the β-strands of the β-hairpins had high β-aggregation propensity scores and the parts corresponding to the turns had scores of zero (Fig. 2 C).Hence, a first search criterion was the presence of two regions with nonzero β-aggregation propensity scores separated by a segment of zero β-aggregation propensity.More specifically, sequences covering two stretches with between 5 and 23 consecutive residues with β-aggregation propensity scores ≥ 1.06 separated by maximally nine residues with a β-aggregation propensity score of zero were searched for (Fig. 3A).This resulted in 26,715 potential β-hairpins forming amyloidogenic regions distributed over 10,711 proteins out of a total of 20,162 analyzed human protein sequences.In other words, 53% of all sequences contained an average of 2.5 potential β-hairpin regions as judged by the TANGO search criterion.This high number is in line with the observation that the majority of human proteins contain multiple aggregation-prone regions (APRs) as assessed by computational predictions [54].
For a more stringent search pattern, additional requirements were imposed on the type of amino acids in the β-aggregation propensitypositive stretches, based on the specific sequence compositions of the Aβ, αSyn and IAPP β-hairpins.In particular, hydrophobicity and aromaticity were considered, as aromatic and hydrophobic amino acids have been implicated in amyloid formation [55,56].Slightly different requirements were placed on the N-terminal and C-terminal β-aggregation propensity-positive stretch to reflect the known properties of the Aβ, αSyn and IAPP β-hairpins.In the N-terminal stretch, at least one aromatic residue and a fractional content of the large aliphatic residues leucine, isoleucine, and valine between 0.32 and 0.5 of all residues were required.In the C-terminal stretch, between 1 and 3 glycine residues and a fractional content of the large aliphatic residues between 0.32 and 0.61 of all residues were required (Fig. 3A).The use of these additional requirements resulted in a reduction of hits to 2505 potential β-hairpin regions distributed over 2098 proteins (Fig. 3B).In other words, 10% of the human protein sequences contained an average of 1.2 potential β-hairpins.Code and output sequences of the prediction of amyloidogenic β-hairpins are available on zenodo under doi:10.5281/zenodo.8264535.

Characterization of a test set of potential amyloidogenic β-hairpin protein fragments
To evaluate whether the identified protein segments are indeed aggregation-prone and able to adopt β-hairpin conformation a test set of eight segments was investigated in peptide format (see Fig. 4A for source proteins, peptide names and sequences, and an overview of characterization results).We included in the test set the two segments that stem from proteins that are associated with the keyword 'amyloid' in Uniprot, namely fibrinogen alpha chain and prostatic acid phosphatase, as these might have an increased relevance in the context of amyloid formation.The other six segments were chosen randomly; however, transmembrane proteins were not considered.Apart from the predicted β-hairpin sequences, the peptides included up to three additional, mostly charged or polar amino acids at their termini, in accordance with the sequences of the source proteins, with the aim to a) increase peptide hydrophilicity and b) ensure that β-hairpin formation is not impeded by insufficient peptide length.In the sequence context of their source proteins, the segments adopt diverse secondary and tertiary structures (Fig. 4B).In one case, TTC5(335− 361), the segment forms a β-hairpin in the source protein's native fold.Here, we investigated the segments' behavior when excised from the source protein.
When dissolved in aqueous buffer at µM concentrations at neutral pH, four of the eight peptides exhibited extensive precipitate formation indicative of low solubility (Fig. 4A).To test if any of the eight peptides form amyloid, the dye Thioflavin T (ThT), which shows dramatically increased fluorescence upon binding to amyloid fibrils [57], was added to 20-25 µM solutions of the peptides and time courses of ThT fluorescence were recorded (Fig. 5).Only in two cases, NBPF1(928− 946) and PAP(185− 208), no increase in fluorescence intensity was detected.When the peptide concentration was increased to 500-550 µM, PAP (185− 208) exhibited ThT fluorescence (Fig. 5D), whereas NBPF1 (928− 946) remained ThT negative (Fig. 5E).For five peptides, ThT fluorescence was initially low but increased rapidly after a lag-time, which is characteristic of amyloid formation (Fig. 5A,B,D,F,G).Two of these five peptides, FibA(655− 678) and 3-PGDH(298− 319), belonged to the group showing immediate precipitate formation upon dissolution, indicating that they can form both ThT-positive as well as ThT-negative aggregates.For two further peptides exhibiting immediate precipitation, CYP2W1(176− 204) and MIC26(153− 178), ThT fluorescence was high from the beginning, indicating that ThT-positive structures had formed directly upon sample preparation (Fig. 5C,H).We aspired to image the aggregates at the end of the aggregation assays by AFM, but the amount of material on the substrate was typically sparse, which we attribute to a poor interaction of the comparatively hydrophobic peptides with the hydrophilic mica substrate.Nevertheless, fibrillar aggregates confirming amyloid formation were detected of NDUFS7(104− 129), PAP (185− 208), CYP2W1(176− 204), and MIC26(153− 178) (Fig. 5).Collectively, these initial experiments showed that only one of the eight peptides remained soluble, whereas the other seven formed aggregates with features typical for amyloids.
We next applied ITC to test if the peptides have affinity for β-wrapin AS10, which binds Aβ, αSyn and IAPP each with sub-micromolar affinity and concomitantly induces β-hairpin conformation [43].This was investigated only for the four peptides that did not precipitate in aqueous buffer at neutral pH, as heats resulting from precipitate formation or dissolution would superpose the ITC signal that stems from binding.For the three peptides that exhibited an increase in ThT fluorescence in the course of the aggregation assays, TTC5(335− 361), NDUFS7(104− 129), and PAP(185− 208), heat signals in agreement with 1:1 binding were observed, yielding dissociation constants of 4.7 ± 0.9 µM, 4.7 ± 0.5 µM, and 39 ± 5 µM, respectively (Figs. 4A and 6).
In contrast, heat signals were not observed for the peptide NBPF1 (928− 946) that was soluble and did not show ThT fluorescence.
To test for β-hairpin formation in the peptides upon binding to AS10 we performed NMR spectroscopy. 1 H- 15 N HSQC NMR spectroscopy using 15 N-labeled AS10 ( 15 N-AS10) and unlabeled binding partner leads to characteristic changes upon β-hairpin folding coupled to binding.These include a great increase in resonance dispersion, appearance of four amide proton resonances in the glycine region ( 15 N chemical shift <110 ppm) stemming from Gly-13 and Gly-14 in the two AS10-subunits, as well as appearance of amide proton resonances with shift values typical for β-sheet conformation (high 1 H and high 15 N chemical shifts) [43].This signature was observed here when 100 µM 15 N-AS10 was mixed with a 20% molar excess of one of the three peptides TTC5 (335− 361), NDUFS7(104− 129) or PAP(185− 208), which possess affinity for AS10 according to ITC (Fig. 7A-C).Interestingly, this signature was also observed for CYP2W1(176− 204), which formed precipitates upon dissolution, indicating that this peptide can also partition into a 1:1 complex with AS10 (Fig. 7D).In contrast, no change in the 1 H- 15 N HSQC NMR spectrum of 15 N-AS10 was observed upon addition of the other four peptides, which is in line with the absence of AS10 affinity for NBPF1(928− 946) demonstrated by ITC and with the insolubility of FibA (655− 678), 3-PGDH(298− 319), and MIC26(153− 178).It is possible the latter three peptides would engage in an interaction with AS10 and adopt β-hairpin structure when brought in contact with AS10 before precipitating into aggregates; however, this was not investigated further.
To determine the registries of the β-hairpins of TTC5(335− 361), NDUFS7(104− 129), PAP(185− 208), and CYP2W1(176− 204) formed in complex with AS10, we performed NMR spectroscopy on the 13 C, 15 Nlabeled complexes of peptides with AS10.To circumvent cost-intensive chemical synthesis of isotopically labeled peptides, fusion constructs of AS10 at the N-terminus and the respective peptide at the C-terminus,  15 N-labeled AS10 bound to the unlabeled peptides (Fig. 7B-D) were recovered in the 13 C, 15 N-labeled fusion constructs, demonstrating that the structure of the complex is not affected by the fusion (Fig. 8A).In contrast, this was not true for the AS10 fusion construct of TTC5(335− 361), which prohibited determination of the β-hairpin registry in this case.For registry determination, the resonances stemming from the β-hairpin motif were assigned by standard triple resonance heteronuclear NMR techniques and characteristic NOE contacts involving NH and Hα protons in the backbones of the peptides and of AS10 were identified.For all three peptides, the NOE contacts were compatible with only one unique β-hairpin registry (Fig. 8B and S2).The β-hairpin registries determined for PAP(185− 208), CYP2W1(176− 204), and NDUFS7(104− 129) are shown in Fig. 8C-E along with those of Aβ, αSyn and IAPP (Fig. 8F).
As the general architecture of all six intermolecular β-sheets was identical in all the cases (Fig. 8C-F), we modeled the structures of the AS10 complexes of PAP(185− 208), CYP2W1(176− 204), and NDUFS7 (104− 129) using the known structures of β-wrapins with Aβ, αSyn and IAPP as starting point.Specifically, the template used for initial modeling of the AS10 complexes of the peptides was the structure of the Aβ:AS10 complex [51].We first mutated the Aβ residues that form the two β-strands to the residues of the peptides at the corresponding position in the experimentally derived β-sheet registries.Subsequently, the loop residues of Aβ (VGSNKG) were replaced by the loop residues of the individual peptides, modeled using Modeller [52].The resulting structures were used as initial structures for MD simulations performed analogous to our previous studies [51,53], yielding snapshots of the AS10 complexes of PAP(185− 208), CYP2W1(176− 204), and NDUFS7 (104− 129) which were generally in agreement with the experimental constraints (Fig. 9).Files containing coordinates of the ten snapshots of lowest energy for each complex are available in the supplementary material of this article.
Taken together, the characterization of the test set of eight protein segments identified in the prediction showed that the majority of them formed ThT-positive structures.For four of the eight segments, β-hairpin formation in complex with AS10 analogous to that observed for Aβ, αSyn and IAPP was confirmed.

PAP(185-208) triggers semen amyloid formation
The herein identified protein segments with capacity to adopt β-hairpin conformation are usually incorporated into the native fold of their source protein, which prohibits amyloid formation (Fig. 4B).However, when liberated from the native fold, they might be able to form amyloid and gain new biological activities.For example, prostatic acid phosphatase (PAP) is synthesized in the prostatic gland and secreted into the semen, where it is proteolytically cleaved into many fragments [58].Two of these fragments, PAP(248− 286) and PAP (85− 120) are known to exist as amyloids in the seminal plasma [59,60].Similarly, semenogelin-1 (SEM1) and semenogelin-2 (SEM2) are cleaved into fragments that also form semen amyloids [61,62].The semen amyloids belong to the class of functional amyloid, as they fulfill beneficial roles in the selection of healthy sperms [63].

Discussion
In this study we aimed to identify protein segments with similarity to Aβ, αSyn and IAPP, three highly amyloidogenic proteins with important roles in protein misfolding diseases.Starting point was the common formation of β-hairpins in all three proteins in regions that are critical for amyloid formation [37][38][39].These β-hairpins are stabilized upon binding to engineered binding proteins, have a common architecture, and one binding protein, β-wrapin AS10, binds all three proteins with sub-micromolar affinity [43].The search for proteins with propensity to form similar β-hairpins might identify further molecules that contribute to disease processes and may improve our understanding of the roles of β-hairpins in amyloid formation.By combining the capacity of the TANGO algorithm to predict APRs with the general architecture of β-hairpins and some specific features of the amino acid composition of the Aβ, αSyn and IAPP β-hairpins, we filtered the human proteome and obtained 2505 potential β-hairpin regions distributed over 2098 proteins.The characterization of the test set of eight protein segments showed that this approach could indeed identify protein segments that form amyloid and can adopt β-hairpin conformation.The test set included one peptide, NBPF1(928− 946), that did not form amyloid and remained soluble, showing that the filtered sequences have to be analyzed case-by-case.The test data set was not large enough to derive further factors deciding if or if not a sequence will form ThT-positive aggregates and bind to AS10 in a β-hairpin conformation.However, the wide range of pI values of the identified peptides indicates that no specific charge state is required (Fig. 4A).The peptides within the test set with highest affinity for AS10 achieved a K D of 5 µM, which is an order of magnitude higher than the K D s of Aβ, αSyn and IAPP [43].The higher affinity of Aβ and αSyn is not surprising, as the generation of AS10 included steps for selection for Aβ binding and affinity maturation for αSyn.The high affinity of IAPP, on the other hand, is remarkable, as it was not involved in the selection [43].
The search strategy for amyloidogenic β-hairpins was governed by sequence similarity with the β-hairpin segments of Aβ, αSyn and IAPP.
Altered search criteria, such as higher or lower TANGO aggregation propensities, different lengths of β-strand or turn regions, or different amino acid compositions, will result in variable lists of protein segments, which might be as relevant or even more relevant than the list generated here.As a matter of principle, a potential relevance of the identified β-hairpin segments has to be evaluated case-by-case.The segments are usually incorporated into the source protein structure, lacking that conformational flexibility that is required for folding into a β-hairpin or into an amyloid structure (Fig. 1).However, the segments may gain this flexibility during their lifetimes, as described above for PAP, which is cleaved into fragments by proteases in the seminal fluid.PAP fragments that include the segment PAP(185− 208) are likely present in the seminal fluid, where they may trigger the formation of semen amyloids built from PAP(248− 286) and SEM1(86− 107), as suggested by the aggregation assay data (Fig. 10).Investigation of the two APR peptides that together constitute the PAP(185− 208) β-hairpin segment revealed that the individual APRs were both able to trigger semen amyloid formation, i.e., PAP(185− 208) combines two β-strand peptides that can both trigger cross-amyloid formation on their own (Fig. 11).Further case-bycase experiments are required to elucidate which potential novel properties the β-hairpin segments gain by combining two APRs.Interestingly, the ability of PAP Str1(186− 194) to form ThT-positive aggregates on its own (Fig. 11B) does not seem to be required for triggering heterologous amyloid formation, as the complete β-hairpin peptide PAP(185− 208) does not form ThT-positive aggregates alone but still promotes semen amyloid formation.
In conclusion, this study has identified a large number of potential β-hairpin-forming amyloidogenic protein segments in the human proteome which may have important roles in amyloid formation and which may serve as a resource to study general and specific effects of β-hairpin conformers on amyloid formation.

Fig. 1 .
Fig. 1. : Schematic of the potential involvement of β-hairpin conformers (red) in the amyloid formation reaction.(A) Comparison of β-hairpin with β-strand-turnβ-strand motif in amyloid fibrils.Backbone hydrogen bonds are intramolecular in β-hairpins but intermolecular in typical amyloid fibrils, resulting in antiparallel (β-hairpin) vs. parallel (amyloid fibril) β-sheets.For the β-strand-turn-β-strand motif selected side chains are displayed as spheres to illustrate intramolecular side chain-side chain interactions that stabilize amyloid fibrils in a steric zipper-like fashion.(B) β-Hairpins may have multiple effects in amyloid formation pathways.

Fig. 3 .
Fig. 3. Criteria and workflow for prediction of amyloidogenic β-hairpins.(A) Protein segments were searched for that contained two segments of high β-aggregation propensity and an amino acid composition similar to the β-hairpins of Aβ, αSyn, and IAPP.(B) Numbers of protein sequences screened, predicted, and tested.

Fig. 4 .
Fig. 4. Test set of eight protein segments.(A) List of the investigated peptides.Underlined stretches in the peptide sequences correspond to the predicted amyloidogenic β-hairpins.A peptide was considered 'soluble' when no visible precipitation occurred upon dissolution of the lyophilized peptide in 20 mM NaPi, pH 7.4, 50 mM NaCl, to a final concentration of 20 µM.A peptide was considered 'ThT' positive, when increased ThT fluorescence was observed in aggregation assays performed in the peptide concentration range 20-550 µM for up to 85 h in 20 mM NaPi, pH 7.4, 50 mM NaCl.A peptide was considered to form an 'NMR β-hairpin' when the spectral fingerprint of β-hairpin formation was observed upon mixing 100 µM 15 N-AS10 with a 20% molar excess of the respective peptide.Affinities were not determined ('n.d.') for peptides exhibiting precipitation.(B) Conformations of investigated segments (red) in the context of their source proteins.PDB codes 2XVS (TTC5), 1FZD (FibA), 1ND6 (PAP), 5XTB (NDUFS7), 6RJ3 (3-PGDH).

Fig. 9 .
Fig. 9. Computationally predicted structural models of β-hairpins in complex with AS10.Models were initially built based on the common β-sheet architecture of the newly identified β-hairpin complexes and the known β-hairpin complexes of Aβ, αSyn and IAPP.The 10 structures of lowest energy in subsequent MD simulations are displayed.The corresponding coordinate files are available in the supplementary material.