Writing your own find-and-replace for Wikipedia markup in Java - Q&A - Alibaba Cloud Developer Community

This is what I have tried so far: https://github.com/curiprogrammer/WikiToLaTeX/tree/master/src

Take the following text as an example:

<!-- Hinweis: Der Artikel [[Konkatenation (Formale Sprache)]] verweist auf diese Überschrift-->
Die '''Konkatenation''' zweier Sprachen <math>L_1</math> und <math>L_2</math> ist die Sprache der Wörter, die durch Hintereinanderschreibung ([[Konkatenation (Wort)|Konkatenation]]) je eines beliebigen Wortes <math>u</math> aus <math>L_1</math> und <math>v</math> aus <math>L_2</math> entsteht:

:<math>L_1 \circ L_2 := \{ uv \mid u \in L_1, v \in L_2 \}</math>.

So sind zum Beispiel die Konkatenationen von verschiedenen Sprachen über dem Alphabet <math>\Sigma = \{ a ,\, b \}</math>:

:<math>\{ a  \} \circ \{ ab \} = \{ aab \}</math>
:<math>\{ a ,\, bb \} \circ \{ aa ,\, b \} = \{ aaa ,\, ab ,\, bbaa ,\, bbb \}</math>
:<math>\{ abb ,\, bab \} \circ \{ \varepsilon ,\, aab ,\, bb \} = \{ abb ,\, bab ,\, abbaab ,\, babaab ,\, abbbb ,\, babbb \}</math>
Heinrich Scholz traf sich 1944 mit [[Konrad Zuse]], der im Zuge seiner Doktorarbeit an seinem [[Plankalkül]] arbeitete. Im März 1945 sprach ihm Scholz für die Anwendung seines Logikkalküls seine Anerkennung aus.<ref>[[Hartmut Petzold]],''Moderne Rechenkünstler. Die Industrialisierung der Rechentechnik in Deutschland.'' München, C.H. Beck Verlag, 1992.</ref>

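I want to convert Wikipedia markup like the text above into LaTeX code. To do that, I need to delete some words and replace others. Example: replace every :<math> with \begin{equation} + \n, and the </math> that follows it with \end{equation}. But if there is no colon in front, <math> should always become $, and </math> should also become $. Then there are things like [[TEXT|text2]], which should be converted to just text2.

I really don't know how to start this project. Java has a .replaceAll() function, but that alone doesn't work, because I need the case distinctions described above. Any hints or ideas on how to approach this project?

Thanks in advance!

Source: Stack Overflow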

import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Wandbox {
    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        // Inline formula: <math>...</math>
        Pattern p1 = Pattern.compile("(<math>)(.*?)(</math>)");
        // Display formula: <math> preceded by a colon
        Pattern p2 = Pattern.compile("(:<math>)(.*?)(</math>)");
        while (sc.hasNextLine()) {
            String line = sc.nextLine();
            // Remove HTML comments
            String replaced = line.replaceAll("<!--.*?-->", "");
            // Handle the colon case first so the inline rule cannot consume it
            Matcher m2 = p2.matcher(replaced);
            replaced = m2.replaceAll("\\\\begin\\{equation\\}\n$2\n\\\\end\\{equation\\}\n");
            // Every remaining <math>...</math> becomes $...$
            Matcher m1 = p1.matcher(replaced);
            replaced = m1.replaceAll("\\$$2\\$");
            System.out.println(replaced);
        }
    }
}
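
The code above does not cover the [[TEXT|text2]] case from the question. A minimal sketch of one way to handle it with the same regex approach follows. The class name WikiLinkSketch and the method stripLinks are hypothetical names chosen for illustration, and the sketch assumes that links are not nested and that a link without a pipe, such as [[Plankalkül]], should keep its target text.

```java
import java.util.regex.Pattern;

public class WikiLinkSketch {
    // Matches [[target|label]] or [[target]].
    // The optional non-capturing group consumes "target|" when a pipe is present;
    // group 1 then holds the text that should remain in the output.
    private static final Pattern LINK =
            Pattern.compile("\\[\\[(?:[^\\]|]*\\|)?([^\\]]*)\\]\\]");

    // Replaces every wiki link in the line with its visible text.
    static String stripLinks(String line) {
        return LINK.matcher(line).replaceAll("$1");
    }

    public static void main(String[] args) {
        String sample = "([[Konkatenation (Wort)|Konkatenation]]) und [[Plankalkül]]";
        System.out.println(stripLinks(sample));
    }
}
```

Calling stripLinks on each line before or after the math replacements should work either way, since the link pattern and the <math> patterns do not overlap in the sample text.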
