diff --git a/docs/contribute.md b/docs/contribute.md
index aa03297..9534231 100644
--- a/docs/contribute.md
+++ b/docs/contribute.md
@@ -1,4 +1,76 @@
-# How to contribute to source code and documentation?
+# How to contribute to YaCy
+
+
+## Be active in the forum
+
+The [community forum](https://community.searchlab.eu) keeps the community alive.
+If you're an advanced user, a good place to start is helping others in
+the forum and sharing your knowledge there.
+
+
+## Report a bug
+
+If you find a bug or want to suggest an improvement, file a
+[github issue](https://github.com/yacy/yacy_search_server/issues).
+
+Please use factual and technical language and try to describe the bug in
+detail.
+
+Focus on what behavior you expected and what YaCy did instead.
+
+See [the log](operation/logging.md) and try to find and attach the
+appropriate log lines. Increase the log
+[verbosity](operation/logging.md#verbosity) if necessary.
+
+Make sure that the issue can be reproduced (describe what to do to see
+the malfunction again). A problem that happened just once, with no way
+to make it happen again, can hardly be fixed.
+
+The pace of issue-fixing is currently quite low, so if you can, repair the
+function yourself and open a github pull request (see below).
+
+
+## Help writing documentation
+
+YaCy has many more functions than described in the [documentation](docs.md)
+and is still heavily under-documented. You can help others by writing about your
+favourite feature, updating the old pages, fixing the installation guide for
+your platform, correcting mistakes or just spell-checking. See the
+[github repository of the documentation](https://github.com/yacy/yacy_net_homepage),
+the [documentation issues](https://github.com/yacy/yacy_search_server/issues?q=state%3Aopen%20label%3ADocumentation)
+and the github guide below.
+
+
+## Help developing YaCy
+
+If you're a Java wizard, you're most warmly welcome to be part of the
+development!
+
+Although YaCy has been developed [since 2003](https://en.wikipedia.org/wiki/YaCy) and is still maintained,
+progress is sometimes slow. The chances of getting your code included in the
+mainline are therefore very high.
+
+You can pick an [issue](https://github.com/yacy/yacy_search_server/issues) to solve.
+They're well tagged with [labels](https://github.com/yacy/yacy_search_server/labels) such as:
+[good first issue](https://github.com/yacy/yacy_search_server/issues?q=state%3Aopen%20label%3A%22good%20first%20issue%22),
+[bug](https://github.com/yacy/yacy_search_server/issues?q=is%3Aissue%20state%3Aopen%20label%3Abug),
+[crawler](https://github.com/yacy/yacy_search_server/issues?q=state%3Aopen%20label%3Acrawler),
+[search](https://github.com/yacy/yacy_search_server/issues?q=state%3Aopen%20label%3Asearch),
+[index](https://github.com/yacy/yacy_search_server/issues?q=state%3Aopen%20label%3Aindex),
+[network](https://github.com/yacy/yacy_search_server/issues?q=state%3Aopen%20label%3Anetwork),
+[releasing](https://github.com/yacy/yacy_search_server/issues?q=state%3Aopen%20label%3Areleasing),
+[developer](https://github.com/yacy/yacy_search_server/issues?q=state%3Aopen%20label%3A%22developer%20issue%22),
+etc.
+
+You can improve what annoys you personally, or craft a feature you like.
+
+Before adding a major feature, consult
+[@orbiter](https://github.com/Orbiter), the main developer, or
+[the forum](https://community.searchlab.eu/).
+
+
+
+## Step by step guide for Github
 
 Basically, your contribution to the code and documentation is possible using
 github.com. Create account there, fork the official repository, clone it to
@@ -6,22 +78,20 @@ your local machine, make a branch, modify files, commit changes to github.
 Finally make a pull request, so your contribution could be merged into master
 branch.
 
-Step by step guide:
-(or follow the github contribution guide
-https://docs.github.com/en/get-started/quickstart/contributing-to-projects)
+(or follow the [github contribution guide](https://docs.github.com/en/get-started/quickstart/contributing-to-projects))
 
-## this is for the first time only
+### this is for the first time only
 
 * create a github.com account
 
 * log into github
 
-* fork repository https://github.com/yacy/yacy_search_server for a yacy
+* fork repository for a yacy
   software modification,
-  or https://github.com/yacy/yacy_net_homepage for documentation editing,
+  or for documentation editing,
   respectively (use a 'fork' button on top right part of github.com)
 
 * your own fork is now at url:
@@ -29,8 +99,7 @@ https://docs.github.com/en/get-started/quickstart/contributing-to-projects)
   or:
   https://github.com/YOURUSERNAMEHERE/yacy_search_server
 
-* set-up a ssh key using this guide:
-  https://docs.github.com/en/authentication/connecting-to-github-with-ssh/adding-a-new-ssh-key-to-your-github-account
+* set up an SSH key using [this guide](https://docs.github.com/en/authentication/connecting-to-github-with-ssh/adding-a-new-ssh-key-to-your-github-account)
 
 * on your local machine, clone your new fork using commandline:
   ```
@@ -48,7 +117,7 @@ https://docs.github.com/en/get-started/quickstart/contributing-to-projects)
   git config --global user.name "YOURUSERNAMEHERE"
   ```
 
-## this is the regular workflow:
+### for every subsequent change
 
 * create a new branch under your working directory:
   ```
@@ -77,9 +146,9 @@ https://docs.github.com/en/get-started/quickstart/contributing-to-projects)
   to the master yacy branch.
 
 * you can see all open pull request by you and other people at:
-  https://github.com/yacy/yacy_search_server/pulls
+
   or
-  https://github.com/yacy/yacy_net_homepage/pulls
+
 
 * wait until the mantainers merge your pull request
diff --git a/docs/docs.md b/docs/docs.md
index 12b6f56..839238e 100644
--- a/docs/docs.md
+++ b/docs/docs.md
@@ -26,10 +26,11 @@
 
 ## Operation
 
-* [Index Creation - Crawl Start](operation/crawlstart_p.md)
+* [Advanced Crawler](operation/crawlstart_p.md)
 * [Setting the ranking rules](operation/ranking.md)
 * [YaCy config settings](operation/yacy_conf.md)
 * [RWI index distribution](operation/rwi-index-distribution.md)
+* [Yacy Packs](operation/yacy-pack.md)
 * [Autoupdate](operation/autoupdate.md)
 * [Portforwarding](operation/portforwarding.md)
 * [Using the YaCy Front-End over HTTPS](operation/yacyoverhttps.md)
@@ -52,7 +53,8 @@
 ## Converted from old-wiki
 may be outdated, you can help the community by checking and [improving](contribute.md) the pages
 
-...
+* [Index Export and Import](operation/index-export-import.md)
+* [Advanced Search Parameters](operation/search-parameters.md)
 
 ## Old and obsolete
 The original YaCy wiki is closed now (no new registration or editing) and
diff --git a/docs/download_installation.md b/docs/download_installation.md
index 6a97168..78aad7d 100644
--- a/docs/download_installation.md
+++ b/docs/download_installation.md
@@ -23,7 +23,7 @@ If you don't have Docker installed, get it from [https://docs.docker.com/get-doc
 * Download YaCy for Windows from [https://download.yacy.net/yacy_v1.924_20201214_10042.exe](https://download.yacy.net/yacy_v1.924_20201214_10042.exe)
-* Download Yacy for Linux from [https://download.yacy.net/yacy_v1.930_202405130205_59c0cb0f3.tar.gz](https://download.yacy.net/yacy_v1.930_202405130205_59c0cb0f3.tar.gz)
+* Download Yacy for Linux from [https://download.yacy.net/yacy_v1.940_202405270005_70454654f.tar.gz](https://download.yacy.net/yacy_v1.940_202405270005_70454654f.tar.gz)
 * Download YaCy for macOS
   from [https://download.yacy.net/yacy_v1.930_202405130205_59c0cb0f3.dmg](https://download.yacy.net/yacy_v1.930_202405130205_59c0cb0f3.dmg)
 * Download latest developer release for Linux from [https://release.yacy.net/](https://release.yacy.net/)
diff --git a/docs/img/indexexpimp1.jpg b/docs/img/indexexpimp1.jpg
new file mode 100644
index 0000000..94b9f6d
Binary files /dev/null and b/docs/img/indexexpimp1.jpg differ
diff --git a/docs/img/indexexpimp2.jpg b/docs/img/indexexpimp2.jpg
new file mode 100644
index 0000000..4a15706
Binary files /dev/null and b/docs/img/indexexpimp2.jpg differ
diff --git a/docs/operation/index-export-import.md b/docs/operation/index-export-import.md
new file mode 100644
index 0000000..34faed9
--- /dev/null
+++ b/docs/operation/index-export-import.md
@@ -0,0 +1,49 @@
+# Index export and import
+
+Since the development version 1.83 build 9250, YaCy has the long-awaited
+feature to handle the index data in a more convenient way: an
+export and import feature has been implemented.
+
+It is no longer necessary to study complicated manuals for the merge of
+two Solr indexes and/or set up an additional stand-alone instance for
+index merging tasks - thanks to the great work of Orbiter - YaCy
+now delivers a powerful export and import feature out of the box\!
+
+## How to do that?
+
+Here's a short tutorial:
+
+![Index Export screenshot](../img/indexexpimp1.jpg)
+
+1\. On the machine you want to export the index data from, open a browser and
+navigate to
+
+Leave the settings as they are, XML (Rich and full-text Solr data, one
+document per line in one large xml file, can be processed with shell
+tools, can be imported with `DATA/SURROGATE/in/`), because it's the best
+choice for the consistency of your data.
+
+2\. Press the 'Export URLs' button and grab some coffee :-)
+
+![Index Export screenshot](../img/indexexpimp2.jpg)
+
+3\. On the machine you want to import the index data to, simply put the
+exported XML file into the following subdirectory of YaCy: `\DATA\SURROGATES\in`
+
+You can do this while YaCy is running - no need to shut it down first\!
+
+4\. Voilà - the import process starts automatically and is blazing fast,
+even on older machines. After the import process is completed, you can
+search through the data instantly - reindexing is not needed.
+
+
+
+
+
+_Converted from
+„“, may be
+outdated_
+
+
+
+
diff --git a/docs/operation/search-parameters.md b/docs/operation/search-parameters.md
new file mode 100644
index 0000000..fcddc0e
--- /dev/null
+++ b/docs/operation/search-parameters.md
@@ -0,0 +1,99 @@
+# Using Advanced Search Parameters
+
+Most search requests contain just one word. If more than one
+word is used, all words will be assumed to be part of an AND relation.
+Besides searching for certain words with YaCy, you can use more advanced
+methods to formulate a search request:
+
+## Excluding a Word
+
+To exclude a word from a search, a minus (`-`) can be used: for example,
+if searching for *jaguar* produces too many results associated with cars
+when in fact you are looking for the animal, searching for `jaguar -car`
+might lead to better results.
+
+## NEAR
+
+`NEAR` can be used to rank results higher if search words appear in the
+text close to each other. Example: `apache server NEAR`.
+
+It does not matter where NEAR is located in the search term. `apache
+server NEAR` and `apache NEAR server` should return the same results.
+
+## site:
+
+`findsomething site:yacy.net` will limit the results to the domain
+yacy.net, subdomains excluded. See also the `tld:` operator.
+
+## tld:
+
+`findsomething tld:co.uk` will limit the results to domains ending with
+`*.co.uk.*` This can also be used to search on subdomains.
+
+## inurl:
+
+`findsomething inurl:source` will limit the results to URLs which
+contain the phrase "source".
+
+## filetype:
+
+`findsomething filetype:pdf` will limit the results to URLs which end
+with `.pdf`.
+
+## LANGUAGE:
+
+`findsomething LANGUAGE:en` will rank results in the English language
+higher. (Note: language detection is still very experimental\!)
+
+## RECENT
+
+`findsomething RECENT` will rank recently crawled pages higher.
+
+## Protocol
+
+`findsomething /ftp` will limit the results to URLs with the FTP protocol
+(ftp://). Available protocols: `/https`, `/http`, `/ftp`, `/smb` or
+`/file`
+
+## author:
+
+`findsomething author:busch` will limit the results to URLs with author
+"busch". `findsomething author:(Wilhelm busch)` will limit the results
+to URLs with author "Wilhelm busch".
+
+## Date Search
+
+Date search finds results that mention a specific date.
+
+*Note: to support date search, the [solr index field](../dev/solr-schema.md#optional-but-recommended) `dates_in_content_dts` must be
+switched on*
+
+
+### on:
+
+`findsomething on:2016/01/01` will limit the results to URLs which
+contain the given date in the content.
+
+
+### from: to:
+
+`findsomething from:2016/01/01` will limit the results to URLs which
+contain a date on or after the `from:` parameter.
+
+`findsomething to:2016/01/01` will limit the results to URLs which
+contain a date on or before the `to:` parameter.
+
+Both can be combined to limit results to the given date range:
+`findsomething from:2016/01/01 to:2016/12/31`
+
+
+
+
+
+_Converted from
+ may
+be outdated_
+
+
+
+
diff --git a/docs/operation/yacy_conf.md b/docs/operation/yacy_conf.md
index 7c80c6a..c273ea5 100644
--- a/docs/operation/yacy_conf.md
+++ b/docs/operation/yacy_conf.md
@@ -6,7 +6,9 @@ You can fine-tune various settings of YaCy in the config file
 You can also change all the settings in in administration interface:
 __Administration > System Administration > Advanced Properties__.
 
-Some changes take effect only after restart.
+Some changes take effect only after restart, and some may be overwritten by a
+running YaCy. The safest way of changing the settings in the config file is
+with YaCy not running: stop YaCy, change the settings, and start it again.
 
 All the config file options and default values are listed in file
 `defaults/yacy.init`.
@@ -18,12 +20,19 @@ Options are described on this page as:
 
 ## System
+
+### Network
+
+``host = 0.0.0.0``
+The network interface this connector binds to, as an IP address or a hostname
+
+
 ``port = 8090``
  port number where the server should bind to
 
 
-``port.ssl = 8443``
- optional ssl port (https port) the server should bind to
+``port.ssl = 8443``
+ optional ssl port (https port) the server should bind to
 
 
 ``port.shutdown = -1``
@@ -31,12 +40,10 @@ Options are described on this page as:
  ( -1 = disable use of a shutdown port, 8005 = recommended default )
 
-
-
-
 ``upnp.enabled = true``
  use UPnP [true/false]
 
+
 ``upnp.remoteHost = ``
  remote host on UPnP device (for more than one connection)
@@ -52,6 +59,15 @@ to run yacy on port 8090, reachable from port 80, set `bindPort=8090`,
  (of course you need to customize the ips)
 
 
+`staticIP=`
+staticIP if you have a static IP, you can use this setting
+
+
+`publicPort=`
+ if you use a different port to access YaCy than the one it listens on, you can use this setting
+
+
+
 ### Paths settings
 `indexPrimaryPath=DATA/INDEX`
 The path to the public reverse word index for text files (web pages).
@@ -144,22 +160,11 @@ because of limitations of the file system, the maximum size can be set here
 `filesize.max.other = 8589934591`
 
 
-## Network
-
-### IP and port
-`staticIP=`
-staticIP if you have a static IP, you can use this setting
-
-
-`publicPort=`
- if you use a different port to access YaCy than the one it listens on, you can use this setting
-
-
 ### Network Definition
 There can be separate YaCy networks, and managed sub-groups of the general network.
-The essentials of the network definition are attached in separate property files.
+The essentials of the [network definition](network-definition.md) are attached in separate property files.
 The property here can also be a url where the definition can be loaded.
@@ -181,7 +186,6 @@ This option is only valid if the `network.unit.domain` property is set to 'any'.
 
 
 
-
 `network.unit.agent =`
 A client may have its own agent name. This name can be set by the user and is set to a random value
 if not overwritten by the user. As an alternative, the name can be set with this property.
@@ -194,6 +198,7 @@ Prefer https for in-protocol operations when available on remote peers.
 A distinct general setting is available to control whether https sould be used for remote search queries : `remotesearch.https.preferred`
 
 
+
 ### Clusters within a network
 Every network can have an unlimited number of clusters. Clusters may be also completely
 sealed and have no connection to other peers. When a cluster does not use the
@@ -233,7 +238,7 @@ it can be rather short
 ### TLS/SSL support
 
 For a German manual see
-http://yacy-websuche.de/wiki/index.php/De:Interface%C3%9CberHTTPS
+
 English speaking user read below: