/*   orgref_fixes
     Author: Christian Gimenez.

     Copyright (C) 2020 Christian Gimenez

     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
     the Free Software Foundation, either version 3 of the License, or
     (at your option) any later version.

     This program is distributed in the hope that it will be useful,
     but WITHOUT ANY WARRANTY; without even the implied warranty of
     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
     GNU General Public License for more details.

     You should have received a copy of the GNU General Public License
     along with this program.  If not, see <http://www.gnu.org/licenses/>.

     11 Jul 2020
*/

:- module(orgref_fixes, [
              fix_citations/4,
              fix_references/3,
              fix_all/4,
              generate_bibliography/3,
              generate_cites/4
          ]).
/** <module> orgref_fixes: Fix an org file with org-ref citations and references.

@author Christian Gimenez
@license GPLv3
*/

:- license(gplv3).

:- use_module(library(dcg/basics)).
:- use_module(orgref_search).
:- use_module(library(bibtex)).
:- use_module(library(bibtex_fields)).

/**
 fix_citations(+Org:term, +Bibtex:term, +Html:string, -Html_output:string)

Fix citation links and insert the bibliography at the end.

@param Org The org file path.
@param Bibtex The bibtex file path.
@param Html The HTML export string. Use read_file_to_string/3 to read a whole file.
@param Html_output The HTML string with the citations fixed and the bibliography inserted.
*/
fix_citations(Org, Bibtex, Html, Html_output) :-
    read_file_to_string(Org, OrgText, []),

    search_citations(OrgText, Lst_cites),
    generate_cites(Bibtex, Lst_cites, Lst_citemaps, Lst_entries),

    html_fix_citations(Lst_citemaps, Html, Html1),
    insert_bibliography(Lst_entries, Html1, Html_output).

/**
 fix_references(Org, Html, Html_output)

Fix references to a figure, table and/or section.

@tbd Not implemented yet: Html is returned unchanged as Html_output.
*/
fix_references(_Org, Html, Html).
% search_refs(Org, Lst_refs),
% replace_refs(Lst_refs, Html).

/**
 fix_all(+Org_file:term, +Bibtex:term, +Html_file:term, +Result_file:term)

Fix the following items on the exported Html_file:

  - References to figures, tables and sections (see fix_references/3).
  - Citations and the bibliography (see fix_citations/4).

@param Org_file The org file path.
@param Bibtex The bibtex file path.
@param Html_file The HTML exported file.
@param Result_file The HTML output file which will be created.
*/
fix_all(Org_file, Bibtex, Html_file, Result_file) :-
    read_file_to_string(Html_file, Html, []),

    fix_references(Org_file, Html, Html1),
    fix_citations(Org_file, Bibtex, Html1, Html2),

    % Redirect the current output to Result_file, write the HTML and close it.
    tell(Result_file),
    write(Html2),
    told.
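
/* A minimal sketch of another way to write the result file, assuming only
SWI-Prolog built-ins. setup_call_cleanup/3 closes the stream even if writing
fails, and the current output stream is left untouched. The name
write_text_file/2 is hypothetical and is not part of this module.

```prolog
% write_text_file(+File, +Text): create (or overwrite) File with Text.
write_text_file(File, Text) :-
    setup_call_cleanup(
        open(File, write, Stream),   % acquire the stream
        write(Stream, Text),         % write the whole text
        close(Stream)).              % always release it
```

With such a helper, the last three goals of fix_all/4 could be replaced by
write_text_file(Result_file, Html2).
*/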

/**
 et_alii(+Author_value:string, -Abbrv:string)

From a BibTeX author value return the abbreviated string, suitable for an APA-styled reference.

For example: from "Tania Tudorache and Csongor Nyulas and Natalya Fridman Noy and Mark A. Musen" return "Tudorache <i>et al.</i>".

The "et al." is added when more than one author is found. Otherwise, only the author surname is used.

@param Author_value The value of the BibTeX author field.
@param Abbrv The abbreviated author string.
*/
et_alii(Author_value, Abbrv) :-
    author_field(field(author, Author_value), Authors),
    et_alii_int(Authors, Abbrv), !.

% Exactly one author: use the surname alone.
et_alii_int([author(Surname, _Name)], Value) :-
    format(string(Value), "~s", [Surname]), !. % red cut.
% More than one author: first surname followed by "et al.".
et_alii_int([author(Surname, _Name)|_Rest], Value) :-
    format(string(Value), "~s <i>et al.</i>", [Surname]), !. % red cut.
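
/* The abbreviation rule above can be tried without a BibTeX file. This is a
sketch under the assumption that authors are already given as author(Surname,
Name) terms holding strings, which is the shape et_alii_int/2 matches on.
abbrev_authors/2 is a hypothetical stand-in written only for illustration.

```prolog
% abbrev_authors(+Authors:list, -Abbrv:string)
% One author: the surname. Several authors: first surname plus "et al.".
abbrev_authors([author(Surname, _Name)], Abbrv) :- !,
    format(string(Abbrv), "~s", [Surname]).
abbrev_authors([author(Surname, _Name)|_Rest], Abbrv) :-
    format(string(Abbrv), "~s <i>et al.</i>", [Surname]).
```

For example:

    ?- abbrev_authors([author("Tudorache", "Tania"), author("Nyulas", "Csongor")], A).
    A = "Tudorache <i>et al.</i>".
*/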

/**
 accent(-Accent_type:string)//

Parse the next character and return the type of accent found.

@param Accent_type A string with the accent type.
*/
accent("'") --> [39].
accent("`") --> [96].
accent("\"") --> [34].
accent("~") --> `~`.
accent("c") --> `c`.

/**
 accented_vowel(+Accent_type:string, -Accented_vowel:codes)//

Parse the next vowel and add the accent in Accent_type. Accent_type can be obtained from accent//1.

@param Accent_type The LaTeX accent type. For instance, for the LaTeX "\'" accent use Accent_type = "'".
@param Accented_vowel The resulting code: the vowel with the accent in one character.
@see accent//1
*/
accented_vowel("'", `á`) --> "a".
accented_vowel("'", `é`) --> "e".
accented_vowel("'", `í`) --> "i".
accented_vowel("'", `ó`) --> "o".
accented_vowel("'", `ú`) --> "u".
accented_vowel("'", `Á`) --> "A".
accented_vowel("'", `É`) --> "E".
accented_vowel("'", `Í`) --> "I".
accented_vowel("'", `Ó`) --> "O".
accented_vowel("'", `Ú`) --> "U".
accented_vowel("`", `à`) --> "a".
accented_vowel("`", `è`) --> "e".
accented_vowel("`", `ì`) --> "i".
accented_vowel("`", `ò`) --> "o".
accented_vowel("`", `ù`) --> "u".
accented_vowel("`", `À`) --> "A".
accented_vowel("`", `È`) --> "E".
accented_vowel("`", `Ì`) --> "I".
accented_vowel("`", `Ò`) --> "O".
accented_vowel("`", `Ù`) --> "U".
accented_vowel("\"", `ä`) --> "a".
accented_vowel("\"", `ë`) --> "e".
accented_vowel("\"", `ï`) --> "i".
accented_vowel("\"", `ö`) --> "o".
accented_vowel("\"", `ü`) --> "u".
accented_vowel("\"", `Ä`) --> "A".
accented_vowel("\"", `Ë`) --> "E".
accented_vowel("\"", `Ï`) --> "I".
accented_vowel("\"", `Ö`) --> "O".
accented_vowel("\"", `Ü`) --> "U".
accented_vowel("~", `ñ`) --> "n".
accented_vowel("~", `Ñ`) --> "N".
accented_vowel("c", `ç`) --> "c".

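/**
 parse_latex(-Text:codes)//

DCG rule to parse a LaTeX fragment and generate plain text without:

  - LaTeX accent commands, which are replaced by the accented character.
  - Curly brackets.
  - Excessive spaces: each run of blanks is reduced to a single space.

@param Text The resulting plain text as a list of codes.
*/
parse_latex([B|Rest]) -->
    % Accent command with curly brackets around the letter.
    `\\`, accent(A), `{`, accented_vowel(A, [B]), `}`, !,
    parse_latex(Rest).
parse_latex([B|Rest]) -->
    % Accent command without curly brackets.
    `\\`, accent(A), accented_vowel(A, [B]), !,
    parse_latex(Rest).
parse_latex(Text) -->
    % Nested group: keep its contents and drop the brackets.
    `{`, parse_latex(Nested), `}`, !, parse_latex(Rest),
    {append(Nested, Rest, Text)}.
parse_latex([32|Rest]) -->
    % A run of blanks becomes one space.
    blank, blanks, parse_latex(Rest).
parse_latex([C|Rest]) -->
    % Any other character, except a closing curly bracket.
    [C], parse_latex(Rest),
    {[C] \= `}`}, !.
parse_latex([]) --> [].
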
/**
 latex2text(+Latex:string, -Text:string)

Convert LaTeX accents into accented characters and remove curly brackets and excessive spaces. For instance, convert the following LaTeX text

    La   herramienta {Prot\'eg{\'{e}}} es utilizada para
            la   {W}eb  {Sem\'{a}ntica}.

into plain text as follows:

    La herramienta Protégé es utilizada para la Web Semántica.

This is useful for generating the bibliography or reference text from the BibTeX author names.

@param Latex A string or codes to convert.
@param Text The result of the conversion, always a string.
*/
latex2text(Latex, Text) :-
    is_of_type(string, Latex), !,

    string_codes(Latex, Latex_codes),
    phrase(parse_latex(Text1), Latex_codes),
    string_codes(Text, Text1).

latex2text(Latex, Text) :-
    is_of_type(codes, Latex),
    phrase(parse_latex(Text1), Latex),
    string_codes(Text, Text1).
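
/* A reduced sketch of the bracket and blank handling described above,
assuming library(dcg/basics) is installed. strip//1 and strip_text/2 are
hypothetical names used only here; accent commands are left out on purpose, so
this is not a replacement for parse_latex//1.

```prolog
:- use_module(library(dcg/basics)).   % blank//0, blanks//0

% strip(-Codes)//: drop curly brackets and collapse runs of blanks.
strip(Text) -->                       % {...}: keep only the contents
    "{", strip(Nested), "}", !, strip(Rest),
    { append(Nested, Rest, Text) }.
strip([32|Rest]) -->                  % one or more blanks become one space
    blank, blanks, !, strip(Rest).
strip([C|Rest]) -->                   % any other code except "}" (125)
    [C], { C =\= 125 }, !, strip(Rest).
strip([]) --> [].

% strip_text(+Latex:string, -Text:string)
strip_text(Latex, Text) :-
    string_codes(Latex, Codes),
    phrase(strip(Out), Codes),
    string_codes(Text, Out).
```

For example:

    ?- strip_text("La   {W}eb  {Sem}", T).
    T = "La Web Sem".
*/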

/**
 get_field(+FieldName:atom, +Fields:list, -Value:string)

Return the field value from an entry/3's field/2 data.

This predicate always succeeds.

@param FieldName The field name to retrieve. Ex.: author.
@param Fields The list of field/2 terms.
@param Value The string value. An empty string if the field does not exist.
*/
get_field(FieldName, Fields, Value) :-
    member(field(FieldName, Value), Fields), !.
get_field(_Fieldname, _Fields, ""). % <- Couldn't find this field
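
/* The lookup-with-default behaviour can be sketched in isolation as follows.
field_or_empty/3 is a hypothetical copy written for illustration; it assumes,
as get_field/3 does, that the fields are field(Name, Value) terms.

```prolog
% field_or_empty(+Name, +Fields, -Value)
% Value is the value of the first field called Name, or "" if there is none.
field_or_empty(Name, Fields, Value) :-
    memberchk(field(Name, Found), Fields), !,
    Value = Found.
field_or_empty(_Name, _Fields, "").
```

For example:

    ?- field_or_empty(year, [field(author, "Schneider"), field(year, "1973")], Y).
    Y = "1973".

    ?- field_or_empty(title, [field(year, "1973")], T).
    T = "".
*/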

/**
 generate_cite(+Entry:term, -Cite_map:term) is det

Generate the cite/2 term from the given entry. The term consists of: cite(Entry_label: string, Reference_string: string).

For example, it generates the following "citemap" from an entry:

    cite("schneider73:_cours_modul_applied", "(Schneider, Edward W, 1973)").

@param Entry An entry/3 term.
@param Cite_map A cite/2 term.
*/
generate_cite(entry(_Key, Label, Fields), cite(Label, Str)):-
    get_field(author, Fields, Author_value),
    latex2text(Author_value, Author_value2),
    et_alii(Author_value2, Auth_abbrv),
    get_field(year, Fields, Year),
    format(string(Str), "(~s, ~s)", [Auth_abbrv, Year]).

/**
 fold_mapcites_pred(+Entry:term, +Lst_prev:list, -Lst_next:list)

Predicate used by a foldl/4 call. Generate the "citemap" from the given Entry and prepend it to the previous list Lst_prev.

@param Entry An entry/3 term.
@param Lst_prev A list of cite/2 citemaps.
@param Lst_next The resulting list of cite/2 citemaps, with the new citemap as its first element.
*/
fold_mapcites_pred(Entry, Lst_prev, [Map|Lst_prev]) :-
    generate_cite(Entry, Map).

/**
 generate_mapcites(+Lst_entries:list, -Lst_maps:list)

Generate citemaps from the given entries. Each citemap is used to generate the citation references in the middle of the text.

For instance, if a bibtex key is found in the middle of the text, it should be replaced by the correct APA (or other style) citation reference.

@param Lst_entries A list of entry/3 terms given by any bibtex predicate library.
@param Lst_maps A list of cite/2 terms.
*/
generate_mapcites(Lst_entries, Lst_maps) :-
    foldl(fold_mapcites_pred, Lst_entries, [], Lst_maps).
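
/* Because each foldl/4 step puts the new item in front of the accumulator,
the resulting list comes out in reverse order of the input. The sketch below
shows this with a hypothetical step predicate, prepend/3, that has the same
shape as fold_mapcites_pred/3. The order does not matter when every citemap is
applied independently, but it is worth knowing if the list is ever displayed.

```prolog
:- use_module(library(apply)).   % foldl/4
:- use_module(library(lists)).   % reverse/2

% prepend(+Item, +Acc0, -Acc): put Item in front of the accumulator.
prepend(Item, Acc0, [Item|Acc0]).

% reversed_by_fold(+List, -Reversed): what a prepend fold produces.
reversed_by_fold(List, Reversed) :-
    foldl(prepend, List, [], Reversed).

% in_order_by_fold(+List, -Ordered): the same fold, restoring input order.
in_order_by_fold(List, Ordered) :-
    foldl(prepend, List, [], Reversed),
    reverse(Reversed, Ordered).
```

For example:

    ?- reversed_by_fold([a, b, c], L).
    L = [c, b, a].
*/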

/**
 generate_cites(+Bibtex:term, +Lst_cites:list, -Map:list, -Lst_entries:list)

Given a list of bibtex citation keys, search each entry on the Bibtex file and create the citation maps.

@param Bibtex The bibtex file path.
@param Lst_cites A list of strings with cite keys.
@param Map A cite/2 map between each key and the APA reference string.
@param Lst_entries The list of bibtex entry/3 entries.
*/
generate_cites(Bibtex, Lst_cites, Map, Lst_entries) :-
    bibtex_get_entries(Bibtex, Lst_cites, Lst_entries),
    generate_mapcites(Lst_entries, Map).

/**
 replace_cites(+Citemap:term, +Html:string, -Html_output:string)

For the Citemap = cite(Citekey, Labelstr), replace every anchor whose text is Citekey with a properly formatted anchor that has Labelstr as its text. The original href value is kept, prefixed with "#".

@param Citemap A cite/2 mapping cite(+Citekey:string, +Labelstr:string).
@param Html The org export as string.
@param Html_output The org export with the anchors replaced.
*/
replace_cites(cite(Key, Str), Html, Html_output) :-
    format(string(Regexp), '<a href="([^"]+)">~s</a>', [Key]),
    format(string(Result),
           '<a href="#\\1" class="cite-link" data-ref="~s">~s</a>',
           [Key, Str]),
    re_replace(Regexp/g, Result, Html, Html_output).

/**
 html_fix_citations(+Lst_citemaps:list, +Html:string, -Html_output:string)

Fix the citation references.

@param Lst_citemaps A list of cite/2 mappings.
@param Html A string with the exported org file.
@param Html_output A string with the fixed anchors.
*/
html_fix_citations(Lst_citemaps, Html, Html_output) :-
    foldl(replace_cites, Lst_citemaps, Html, Html_output).

/**
 process_field(+Fields:list, -Text:string)

Create the bibliography line by using the BibTeX field list.

For example, from an entry(Key, Label, Fields) create the following APA-styled bibliography line format:

    AuthorSurname, AuthorName. Title (Year).

The fields taken from the Fields list are the following:

  - author
  - title
  - year

@param Fields The field list from the BibTeX entry/3 term.
@param Text The string produced.
*/
process_field(Fields, Html) :-
    get_field(author, Fields, Author),
    get_field(year, Fields, Year),
    get_field(title, Fields, Title),

    latex2text(Author, AuthorText),
    latex2text(Title, TitleText),

    format(string(Html), '~s. ~s (~s).', [
               AuthorText, TitleText, Year
           ]).

/**
 generate_html(+Entry:term, -Html:string)

Generate an HTML bibliography div from a BibTeX entry. The HTML produced is as follows:

    <div>
      <a name="Entry_label" class="org-bibitem" ></a>
      <p class="org-bibitem">
        -- APA-styled bibliography reference here --
      </p>
    </div>

@param Entry A BibTeX entry/3 as the one produced by the bibtex library.
@param Html The HTML filled with the BibTeX entry/3 data as described before.
*/
generate_html(entry(_Key, Label, Fields), Html) :-
    process_field(Fields, Fields_HTML),
    format(string(Html),
           '  <div>
    <a name="~s" class="org-bibitem" ></a>
    <p class="org-bibitem">
      ~s
    </p>
  </div>
',
           [
               Label,
               Fields_HTML
           ]).

/**
 fold_bibs_pred(+Entry:term, +PrevHtml:string, -NextHtml:string)

Internal predicate. It is used by a foldl/4 call to generate all the bibliography HTML items.

Generate an APA-styled bibliography text from the given Entry, in HTML syntax, and append it to PrevHtml.

@param Entry A BibTeX entry/3 term from which the author, title and year for the bibliography text are taken.
@param PrevHtml The HTML string generated by the previous call.
@param NextHtml The output HTML for the next call.
@see generate_bibs_html/2
*/
fold_bibs_pred(Entry, PrevHtml, NextHtml) :-
    generate_html(Entry, Html),
    string_concat(PrevHtml, Html, NextHtml).

/**
 generate_bibs_html(+Lst_entries:list, -Html:string)

Generate an APA-styled bibliography from the given entry/3 BibTeX list. The output is formatted using HTML syntax.

@param Lst_entries A list of entry/3 BibTeX entries.
@param Html The bibliography text generated in HTML format.
*/
generate_bibs_html(Lst_entries, Html) :-
    foldl(fold_bibs_pred, Lst_entries, "\n<div class=\"org-bib\">\n", Html1),
    string_concat(Html1, "</div><!-- /org-bib -->\n", Html).
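
/* The accumulation pattern used above, a foldl/4 that starts from the opening
tag, appends one item per step and closes the tag at the end, can be sketched
with plain strings. item_html/3 and bib_html/2 are hypothetical names, and the
markup is simplified with respect to generate_html/2.

```prolog
:- use_module(library(apply)).   % foldl/4

% item_html(+Text, +Prev, -Next): append one bibliography paragraph to Prev.
item_html(Text, Prev, Next) :-
    format(string(Item), "  <p class=\"org-bibitem\">~s</p>\n", [Text]),
    string_concat(Prev, Item, Next).

% bib_html(+Texts, -Html): wrap all the items in a single div.
bib_html(Texts, Html) :-
    foldl(item_html, Texts, "<div class=\"org-bib\">\n", Html0),
    string_concat(Html0, "</div>\n", Html).
```

Unlike the list fold in generate_mapcites/2, this fold appends at the end, so
the items keep the order of the input list.
*/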

/**
 generate_bibliography(+Bibtex:term, +Lst_cites:list, -Html:string)

Generate an APA-styled bibliography in HTML syntax from a list of citation labels.

@param Bibtex The BibTeX file path.
@param Lst_cites A list of strings. These must be citation labels (ex. ["giese15:optique", "chang18:_scaling_knowled_access"]).
@param Html The bibliography text generated in HTML format.
*/
generate_bibliography(Bibtex, Lst_cites, Html) :-
    bibtex_get_entries(Bibtex, Lst_cites, Lst_entries),
    generate_bibs_html(Lst_entries, Html).

/**
 replace_printbibliography(+Html:string, +Bib_html:string, -Html_output:string)

Find every \printbibliography text that stands alone on a line in Html and replace it with Bib_html.

@param Html The input HTML text.
@param Bib_html The bibliography in HTML format.
@param Html_output The output HTML text.
*/
replace_printbibliography(Html, Bib_html, Html_output) :-
    re_replace('\n[[:space:]]*\\\\printbibliography[[:space:]]*\n'/g,
               Bib_html, Html, Html_output).

/**
 insert_bibliography(+Lst_entries:list, +Html:string, -Html_output:string)

Replace the \printbibliography text in the Html string with the bibliography generated from the given BibTeX entries.

The bibliography text used is the HTML produced by the generate_bibs_html/2 predicate.

@param Lst_entries A list of BibTeX entry/3 terms as returned by bibtex_get_entries/3 from the bibtex library.
@param Html The HTML input.
@param Html_output The HTML output with the bibliography inserted where the \printbibliography text was.
*/
insert_bibliography(Lst_entries, Html, Html_output) :-
    generate_bibs_html(Lst_entries, Bib_html),
    replace_printbibliography(Html, Bib_html, Html_output).